arkansas_restruct R Documentation

Process an image multiple times to increase chance of successful extraction

Description

Sometimes the numbers in a cell cannot be extracted from an image with the default extraction values. In that case we loop over several values of the extraction parameter to try to get the cell values.

Usage

arkansas_restruct(img, enhancements = seq(100, 500, by = 50))

Arguments

img

the image to extract values from, loaded into R using the magick package

enhancements

numeric vector of enhancement values to try during extraction

Value

data frame with values extracted from image
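
Examples

The body of arkansas_restruct is not shown here, so the following is only a minimal sketch of the retry idea described above, assuming the function keeps the first result that has no missing values. The names retry_extraction, extract_fun and fake_extract are hypothetical and exist only for illustration.

```r
# Hypothetical helper: try each enhancement value in turn and return the
# first result that contains no missing values.
# `extract_fun` stands in for the real per-image extraction step.
retry_extraction <- function(img, extract_fun,
                             enhancements = seq(100, 500, by = 50)) {
  for (enh in enhancements) {
    result <- tryCatch(extract_fun(img, enhancement = enh),
                       error = function(e) NULL)
    if (!is.null(result) && !any(is.na(result))) {
      return(result)
    }
  }
  stop("extraction failed for every enhancement value")
}

# Stub extractor for illustration: it only succeeds once the
# enhancement value reaches 300.
fake_extract <- function(img, enhancement) {
  if (enhancement < 300) {
    return(data.frame(Residents.Tested = NA_real_))
  }
  data.frame(Residents.Tested = 120, enhancement = enhancement)
}

out <- retry_extraction("placeholder image", fake_extract)
```

Returning as soon as one value works keeps the number of OCR passes low; the real function may instead merge results across enhancement values.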

arkansas_scraper R Documentation

Scraper class for general Arkansas COVID data

Description

AR data is downloaded from an image, newly posted each week, which undergoes OCR. The process can be temperamental. Check the logs frequently, as AR has changed the data it reports twice in the past. Residents Recovered must be combined with Residents Positive Not Recovered to get total confirmed.

Residents Tested

Residents Tested.

Residents Recovered

Residents Recovered.

Residents Positive Not Recovered

Residents Positive Not Recovered.

Staff Tested

Staff Tested. No longer reported.

Staff Recovered

Staff Recovered. No longer reported.

Staff Positive Not Recovered

Staff Positive Not Recovered.
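
The combination described above can be sketched as follows. This assumes that Residents Positive Not Recovered is stored in a column named Residents.Active, which is a guess based on the extracted column names; the values are made up for illustration and are not real facility data.

```r
# Illustrative values only; not real facility data.
ar <- data.frame(
  Name = c("Facility A", "Facility B"),
  Residents.Recovered = c(10, 4),
  Residents.Active = c(3, 0)  # assumed to hold "Positive Not Recovered"
)

# Total confirmed = recovered + positive but not yet recovered.
ar$Residents.Confirmed <- ar$Residents.Recovered + ar$Residents.Active
```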

Details

The last run of the scraper was on 2021-01-11 and contained the extracted columns: Residents.Tested, Residents.Recovered, Name, Residents.Active, Residents.Confirmed, State, Date, id, source, jurisdiction. We are missing the following core variables for the analysis: Staff.Confirmed, Staff.Deaths, Residents.Deaths, Staff.Recovered, Staff.Tested, Residents.Tadmin, Staff.Negative, Residents.Negative, Staff.Pending, Residents.Pending, Staff.Quarantine, Residents.Quarantine, Residents.Population.

Super class

R_GlobalEnv::generic_scraper -> arkansas_scraper

Methods

Public methods

Inherited methods

Method new()

Usage
arkansas_scraper$new(
  log,
  url = "https://doc.arkansas.gov/covid-19-updates/",
  id = "arkansas",
  state = "AR",
  type = "img",
  jurisdiction = "state",
  pull_func = function(x) {
    magick::image_read(get_src_by_attr(base = x, css = "a", attr = "href",
      attr_regex = "(?i)stats.?update", date_regex = "\\d+-\\d+-\\d+"))
  },
  restruct_func = function(x, ...) arkansas_restruct(x, ...),
  extract_func = arkansas_extract
)
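
To show what the two regular expressions in pull_func match, here is a small base R sketch. The href values are hypothetical, and how get_src_by_attr actually uses the patterns (for example, how it picks among several dated links) is not documented here, so this illustrates only the patterns themselves.

```r
# Hypothetical href values; the real page's links are not shown here.
hrefs <- c(
  "/wp-content/uploads/Stats-Update-1-11-21.png",
  "/wp-content/uploads/visitation-policy.pdf",
  "/wp-content/uploads/stats_update-12-28-20.png"
)

# attr_regex: case-insensitive "stats", one optional character, "update".
is_stats <- grepl("(?i)stats.?update", hrefs, perl = TRUE)
stats_links <- hrefs[is_stats]

# date_regex: three groups of digits separated by hyphens.
dates <- regmatches(
  stats_links,
  regexpr("\\d+-\\d+-\\d+", stats_links, perl = TRUE)
)
```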

Method clone()

The objects of this class are cloneable with this method.

Usage
arkansas_scraper$clone(deep = FALSE)
Arguments
deep

Whether to make a deep clone.

arkansas_extract R Documentation

Process an image multiple times to increase chance of successful extraction

Description

Sometimes the numbers in a cell cannot be extracted from an image with the default extraction values. In that case we loop over several values of the extraction parameter to try to get the cell values.

Usage

arkansas_extract(x)

Arguments

url_

the url of the image

enhancements

list of enhancements to use for extraction

Value

data frame with values extracted from image

process_AR_cell R Documentation

Crop out a cell in the AR image and extract the numeric value

Description

The data for AR numbers come from a PNG with consistent formatting. The goal of this function is to isolate the cell holding a single numeric value using the crop argument, enhance the image, and then extract the numeric value using OCR. Note that converting to gray-scale will help this process, and inverting the colors would probably help even more.

Usage

process_AR_cell(base_image, crop, enhancement = 500)

Arguments

base_image

image loaded into R using the magick package

crop

the area of the image to crop to

enhancement

the size to enlarge the image to

Value

numeric value of specified cell.
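
Examples

The real body of process_AR_cell is not shown in this documentation. The sketch below is one plausible way to do the crop, gray-scale, invert, enlarge and OCR steps with the magick package, under the assumption that enhancement is a target width in pixels. sketch_process_cell and parse_cell_number are hypothetical names; magick::image_ocr additionally needs the tesseract package installed.

```r
# Turn raw OCR text into a number by dropping everything except digits.
# This suits whole-number counts; it would discard decimal points and signs.
parse_cell_number <- function(ocr_text) {
  digits <- gsub("[^0-9]", "", ocr_text)
  if (nchar(digits) == 0) NA_real_ else as.numeric(digits)
}

# Sketch only: crop to one cell, clean it up, enlarge it, then run OCR.
# Assumes `enhancement` is the width in pixels to resize the cell to.
sketch_process_cell <- function(base_image, crop, enhancement = 500) {
  cell <- magick::image_crop(base_image, crop)             # isolate the cell
  cell <- magick::image_convert(cell, type = "Grayscale")  # gray-scale
  cell <- magick::image_negate(cell)                       # invert colors
  cell <- magick::image_resize(cell, paste0(enhancement, "x"))  # enlarge
  parse_cell_number(magick::image_ocr(cell))
}

# The parsing step can be tried without an image:
parsed <- parse_cell_number(" 1,234\n")
```

Keeping the text clean-up in its own function makes it possible to test the parsing separately from the image handling.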

process_AR_image R Documentation

Extract information from all cells in an AR image.

Description

Get all information from the cells of an AR image and place it in a tibble. NOTE: THIS ONLY WORKS WITH A PARTICULAR IMAGE FORMAT, AND ANOTHER SET OF CROPPING AREAS WILL LIKELY BE NEEDED IF THIS CHANGES.

Usage

process_AR_image(base_image, ...)

Arguments

base_image

image loaded into R using the magick package

...

other arguments to pass to process_AR_cell

Value

data frame with values extracted from image
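
Examples

As a rough illustration of the approach, the sketch below applies a cell reader to a named set of crop areas and collects the results in a one-row data frame. The crop geometries, sketch_process_image and fake_cell are all hypothetical; the real cropping areas are not listed in this documentation. A base data frame is used instead of a tibble so that the sketch has no dependencies.

```r
# Hypothetical crop areas in magick geometry form "WIDTHxHEIGHT+X+Y".
crops <- c(
  Residents.Tested    = "100x40+200+50",
  Residents.Recovered = "100x40+200+100"
)

# Read every cell with `cell_fun`, forwarding extra arguments via `...`,
# and return a one-row data frame with one column per crop area.
sketch_process_image <- function(base_image, crops, cell_fun, ...) {
  values <- vapply(crops, function(crop) cell_fun(base_image, crop, ...),
                   numeric(1))
  as.data.frame(as.list(values))
}

# Stub used in place of process_AR_cell so the sketch runs without an image:
# it returns the crop's y offset plus the enhancement value.
fake_cell <- function(base_image, crop, enhancement = 500) {
  as.numeric(sub(".*\\+", "", crop)) + enhancement
}

out <- sketch_process_image("placeholder", crops, fake_cell, enhancement = 100)
```

Because the crop areas are plain data, adapting to a new image layout would only mean supplying a different named vector.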