The {avocado} package consists of three different datasets that summarize the weekly sales (units) of Hass Avocados at different regional levels.
- hass_usa: weekly contiguous US avocado sales at the country level
- hass_region: weekly contiguous US avocado sales at the region level
- hass_market: weekly contiguous US avocado sales at the city/sub-region level
Units
Throughout the datasets, you’ll see the term units. Think of a unit as 1 avocado. The Hass Avocado Board (HAB) provides insights on the unit sales of avocados. This includes bags. In terms of bags, 1 unit still refers to 1 avocado. A bag (of any size) may consist of multiple units.
Rounding
The raw dataset that is provided by HAB typically includes fractional units. This does not imply that fractions of avocados were sold. Rather, the underlying data comes from external sources. These sources can provide fractional units depending on how they count units. For example, partial sales could result in fractional units. For the datasets in this package, all values have been rounded up to the nearest whole number.
See the HAB website for their summarized reports and informational dashboards.
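Rounding has a practical side effect: component columns may not add up exactly to a total column. The sketch below illustrates this with invented fractional values (not HAB data) and assumes that "rounded up" means the behaviour of ceiling().

```r
# Hypothetical fractional unit counts for three components of one total.
# These numbers are invented for illustration and are not HAB data.
parts <- c(plu4046 = 10.2, plu4225 = 20.4, plu4770 = 5.1)
total <- sum(parts)

# Assumption: "rounded up" is read here as ceiling().
rounded_parts <- ceiling(parts)
rounded_total <- ceiling(total)

# The rounded components no longer add up to the rounded total.
drift <- sum(rounded_parts) - rounded_total
print(drift)
```

Under this assumption, the drift for n components is never negative and is at most n - 1 units, so comparisons between sums and totals are best made with a small tolerance.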
PLU
The product/price lookup code (PLU) uniquely identifies a product (mainly produce). The Hass Avocado Board focuses on six different PLUs:
- 4046: non-organic small/medium Hass Avocados (~3-5 oz); also known as Hass #60 size or smaller
- 4225: non-organic large Hass Avocados (~8-10 oz); also known as Hass #40 size and Hass #48 size
- 4770: non-organic extra large Hass Avocados (~10-15 oz); also known as Hass #36 size or larger
Organic avocados have the digit 9 prefixed to the non-organic PLUs:
- 94046: organic small/medium Hass Avocados (~3-5 oz)
- 94225: organic large Hass Avocados (~8-10 oz)
- 94770: organic extra large Hass Avocados (~10-15 oz)
Within these datasets, you’ll want to use the type column combined with the columns plu4046_units, plu4225_units, and plu4770_units to determine whether the units are for conventional (non-organic) or organic avocados. For example, if the type is Organic and you look at the value in plu4046_units, you’ll actually be looking at the unit sales for organic avocados with PLU 94046.
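The rule above can be sketched as a small lookup. The function plu_code() is a hypothetical helper written for this illustration; it is not part of the {avocado} package.

```r
# Hypothetical helper (not part of the avocado package): derive the PLU code
# that a plu*_units column refers to, given the value of the type column.
plu_code <- function(column, type) {
  # Extract the four-digit code embedded in the column name, e.g. "4046".
  base <- sub("^plu([0-9]{4})_units$", "\\1", column)
  # Organic PLUs are the conventional code with a leading 9.
  ifelse(tolower(type) == "organic", paste0("9", base), base)
}

plu_code("plu4046_units", "Organic")
plu_code("plu4225_units", "Conventional")
```

The comparison lowercases the type value so that it does not depend on how the value is capitalised.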
Bags vs PLU
Another distinction that the HAB makes is between bags and bulk. Bulk typically means avocados sold as individual pieces, which are easily distinguishable by their PLU codes. Hence, a PLU refers to a bulk sale. Bags, on the other hand, are pre-packaged containers holding a variable number of avocados that may differ in weight.
Region vs. Location
The hass_region and hass_market datasets share a variable called region, and the hass_market dataset has an additional variable called market, which identifies the location. Regions are defined by the Hass Avocado Board, and Locations are selected cities or sub-regions that are part of the overall Region. Because only selected locations are reported, the totals for all locations within a Region will not equal the total for that Region. For convenience, here is a breakdown of the Regions and the states they cover:
- California
- West
  - Washington
  - Oregon
  - Idaho
  - Nevada
  - Montana
  - Utah
  - Arizona
  - Wyoming
  - Colorado
  - New Mexico
- Plains
  - North Dakota
  - South Dakota
  - Nebraska
  - Kansas
  - Minnesota
  - Iowa
  - Missouri
- South Central
  - Texas
  - Oklahoma
  - Arkansas
  - Louisiana
- Southeast
  - Mississippi
  - Alabama
  - Georgia
  - South Carolina
  - Florida
- Midsouth
  - Kentucky
  - Tennessee
  - North Carolina
  - West Virginia
  - Virginia
  - Maryland
  - Delaware
- Great Lakes
  - Wisconsin
  - Illinois
  - Michigan
  - Indiana
  - Ohio
- Northeast
  - Pennsylvania
  - New York
  - Vermont
  - New Hampshire
  - Massachusetts
  - Connecticut
  - Rhode Island
  - New Jersey
  - Maine
Datasets
hass_usa
The hass_usa dataset focuses on weekly Hass Avocado sales at the country (i.e., contiguous US) level and consists of the following fields:
- week_ending: The date of the last day of the week in YYYY-MM-DD format.
- type: Whether the avocados are non-organic (Conventional) or organic (Organic).
- avg_selling_price: The Average Selling Price. This value is derived by HAB as total dollar sales divided by total units sold. It is not the advertised selling price (e.g., the price you may see in stores).
- total_bulk_and_bags_units: The total number of avocados sold. This includes avocados sold individually (i.e., bulk) or in bags.
- plu4046_units: The total number of avocados sold under this PLU. For non-organic, consider this to be PLU 4046 and ensure that the value in the type column is Conventional. For organic, consider this to be PLU 94046 and ensure that the value in the type column is Organic.
- plu4225_units: The total number of avocados sold under this PLU. For non-organic, consider this to be PLU 4225 and ensure that the value in the type column is Conventional. For organic, consider this to be PLU 94225 and ensure that the value in the type column is Organic.
- plu4770_units: The total number of avocados sold under this PLU. For non-organic, consider this to be PLU 4770 and ensure that the value in the type column is Conventional. For organic, consider this to be PLU 94770 and ensure that the value in the type column is Organic.
- total_bagged_units: The total number of avocados sold in bags. This is not the number of bags sold.
- sml_bagged_units: The total number of avocados sold in small bags. This value is no longer available from 2021 onwards.
- lrg_bagged_units: The total number of avocados sold in large bags. This value is no longer available from 2021 onwards.
- xlrg_bagged_units: The total number of avocados sold in extra large bags. This value is no longer available from 2021 onwards.
library(avocado)
data('hass_usa')
dplyr::glimpse(hass_usa)
#> Rows: 810
#> Columns: 11
#> $ week_ending <date> 2017-01-02, 2017-01-08, 2017-01-15, 2017-01…
#> $ type <chr> "Conventional", "Conventional", "Conventiona…
#> $ avg_selling_price <dbl> 0.89, 0.99, 0.98, 0.94, 0.96, 0.77, 0.87, 0.…
#> $ total_bulk_and_bags_units <dbl> 38879717, 38049803, 38295489, 42140394, 3937…
#> $ plu4046_units <dbl> 12707895, 11809728, 12936859, 14254151, 1403…
#> $ plu4225_units <dbl> 14201201, 13856935, 12625666, 14212882, 1168…
#> $ plu4770_units <dbl> 549845, 539069, 579347, 908617, 818728, 1664…
#> $ total_bagged_units <dbl> 11420777, 11844072, 12153619, 12764745, 1283…
#> $ sml_bagged_units <dbl> 8551134, 9332972, 9445623, 9462854, 9918256,…
#> $ lrg_bagged_units <dbl> 2802710, 2432260, 2638919, 3231020, 2799961,…
#> $ xlrg_bagged_units <dbl> 66934, 78841, 69078, 70872, 119096, 112870, …
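As a sketch of how these fields relate, the example below rebuilds the first two rows from the glimpse() output above and checks that bulk units (the three PLU columns) plus bagged units match the reported total. It also estimates dollar sales by inverting the average selling price definition. The column names bulk_units, gap, and est_dollar_sales are made up for this illustration, and treating the PLU columns as the whole of bulk sales is an assumption based on the field descriptions.

```r
# First two rows of hass_usa, copied from the glimpse() output above.
usa <- data.frame(
  week_ending = as.Date(c("2017-01-02", "2017-01-08")),
  type = c("Conventional", "Conventional"),
  avg_selling_price = c(0.89, 0.99),
  total_bulk_and_bags_units = c(38879717, 38049803),
  plu4046_units = c(12707895, 11809728),
  plu4225_units = c(14201201, 13856935),
  plu4770_units = c(549845, 539069),
  total_bagged_units = c(11420777, 11844072)
)

# Bulk units: avocados sold individually, i.e. the three PLU columns.
usa$bulk_units <- usa$plu4046_units + usa$plu4225_units + usa$plu4770_units

# Difference between the reported total and bulk + bagged units.
usa$gap <- usa$total_bulk_and_bags_units -
  (usa$bulk_units + usa$total_bagged_units)

# Approximate dollar sales, inverting price = dollars / units.
usa$est_dollar_sales <- usa$avg_selling_price * usa$total_bulk_and_bags_units

usa[, c("week_ending", "bulk_units", "gap", "est_dollar_sales")]
```

For these two rows the gap is a single unit, which is consistent with the rounding of the individual columns. The dollar estimate is only approximate because avg_selling_price is itself rounded to cents.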
hass_region
The hass_region dataset focuses on weekly US Hass Avocado sales at the region level and consists of the following fields:
- region: Specific region within the US as defined by the Hass Avocado Board.
- week_ending: The date of the last day of the week in YYYY-MM-DD format.
- type: Whether the avocados are non-organic (Conventional) or organic (Organic).
- avg_selling_price: The Average Selling Price. This value is derived by HAB as total dollar sales divided by total units sold. It is not the advertised selling price (e.g., the price you may see in stores).
- total_bulk_and_bags_units: The total number of avocados sold. This includes avocados sold individually (i.e., bulk) or in bags.
- plu4046_units: The total number of avocados sold under this PLU. For non-organic, consider this to be PLU 4046 and ensure that the value in the type column is Conventional. For organic, consider this to be PLU 94046 and ensure that the value in the type column is Organic.
- plu4225_units: The total number of avocados sold under this PLU. For non-organic, consider this to be PLU 4225 and ensure that the value in the type column is Conventional. For organic, consider this to be PLU 94225 and ensure that the value in the type column is Organic.
- plu4770_units: The total number of avocados sold under this PLU. For non-organic, consider this to be PLU 4770 and ensure that the value in the type column is Conventional. For organic, consider this to be PLU 94770 and ensure that the value in the type column is Organic.
- total_bagged_units: The total number of avocados sold in bags. This is not the number of bags sold.
- sml_bagged_units: The total number of avocados sold in small bags. This value is no longer available from 2021 onwards.
- lrg_bagged_units: The total number of avocados sold in large bags. This value is no longer available from 2021 onwards.
- xlrg_bagged_units: The total number of avocados sold in extra large bags. This value is no longer available from 2021 onwards.
library(avocado)
data('hass_region')
dplyr::glimpse(hass_region)
#> Rows: 6,480
#> Columns: 12
#> $ region <chr> "California", "Great Lakes", "Midsouth", "No…
#> $ week_ending <date> 2017-01-02, 2017-01-02, 2017-01-02, 2017-01…
#> $ type <chr> "Conventional", "Conventional", "Conventiona…
#> $ avg_selling_price <dbl> 0.89, 0.88, 1.12, 1.35, 0.83, 0.64, 0.94, 0.…
#> $ total_bulk_and_bags_units <dbl> 7175277, 4225246, 2878968, 3513389, 2382743,…
#> $ plu4046_units <dbl> 2266314, 636278, 653896, 174843, 1462455, 35…
#> $ plu4225_units <dbl> 2877689, 2157250, 1285365, 2589316, 509660, …
#> $ plu4770_units <dbl> 90900, 189357, 64704, 39607, 4781, 27549, 92…
#> $ total_bagged_units <dbl> 1940376, 1242362, 875005, 709624, 405849, 13…
#> $ sml_bagged_units <dbl> 1762034, 885770, 719380, 659612, 387098, 110…
#> $ lrg_bagged_units <dbl> 151334, 349033, 151227, 49533, 13009, 230435…
#> $ xlrg_bagged_units <dbl> 27008, 7560, 4399, 479, 5743, 16335, 1884, 3…
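Because avg_selling_price is defined as dollars divided by units, combining prices across regions calls for a unit-weighted mean rather than a simple mean. The sketch below shows the difference using the first three rows of the glimpse() output above. The variable names simple_price and weighted_price are made up for this illustration.

```r
# First three rows of hass_region, copied from the glimpse() output above
# (all Conventional, week ending 2017-01-02).
regions <- data.frame(
  region = c("California", "Great Lakes", "Midsouth"),
  avg_selling_price = c(0.89, 0.88, 1.12),
  total_bulk_and_bags_units = c(7175277, 4225246, 2878968)
)

# Simple mean treats every region equally, regardless of volume.
simple_price <- mean(regions$avg_selling_price)

# Unit-weighted mean is equivalent to total dollars / total units.
weighted_price <- weighted.mean(
  regions$avg_selling_price,
  w = regions$total_bulk_and_bags_units
)

c(simple = simple_price, weighted = weighted_price)
```

In this excerpt the weighted price is lower than the simple mean, because the highest-priced region (Midsouth) also has the smallest volume.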
hass_market
The hass_market dataset summarizes weekly Hass Avocado sales within the contiguous US based on city or sub-region. These areas are defined by the HAB and make up portions of the region field in the hass_region dataset. The fields are:
- region: Specific region within the US as defined by the Hass Avocado Board.
- market: The market within the specified region of the United States. This market typically represents a major metropolitan city (or cities) reporting the highest sales.
- week_ending: The date of the last day of the week in YYYY-MM-DD format.
- type: Whether the avocados are non-organic (Conventional) or organic (Organic).
- avg_selling_price: The Average Selling Price. This value is derived by HAB as total dollar sales divided by total units sold. It is not the advertised selling price (e.g., the price you may see in stores).
- total_bulk_and_bags_units: The total number of avocados sold. This includes avocados sold individually (i.e., bulk) or in bags.
- plu4046_units: The total number of avocados sold under this PLU. For non-organic, consider this to be PLU 4046 and ensure that the value in the type column is Conventional. For organic, consider this to be PLU 94046 and ensure that the value in the type column is Organic.
- plu4225_units: The total number of avocados sold under this PLU. For non-organic, consider this to be PLU 4225 and ensure that the value in the type column is Conventional. For organic, consider this to be PLU 94225 and ensure that the value in the type column is Organic.
- plu4770_units: The total number of avocados sold under this PLU. For non-organic, consider this to be PLU 4770 and ensure that the value in the type column is Conventional. For organic, consider this to be PLU 94770 and ensure that the value in the type column is Organic.
- total_bagged_units: The total number of avocados sold in bags. This is not the number of bags sold.
- sml_bagged_units: The total number of avocados sold in small bags. This value is no longer available from 2021 onwards.
- lrg_bagged_units: The total number of avocados sold in large bags. This value is no longer available from 2021 onwards.
- xlrg_bagged_units: The total number of avocados sold in extra large bags. This value is no longer available from 2021 onwards.
library(avocado)
data('hass_market')
dplyr::glimpse(hass_market)
#> Rows: 38,522
#> Columns: 13
#> $ region <chr> "Northeast", "Southeast", "Midsouth", "West"…
#> $ market <chr> "Albany", "Atlanta", "Baltimore/Washington",…
#> $ week_ending <date> 2017-01-02, 2017-01-02, 2017-01-02, 2017-01…
#> $ type <chr> "Conventional", "Conventional", "Conventiona…
#> $ avg_selling_price <dbl> 1.47, 0.93, 1.47, 0.92, 1.29, 1.43, 1.21, 1.…
#> $ total_bulk_and_bags_units <dbl> 129949, 547566, 631761, 104511, 458831, 1053…
#> $ plu4046_units <dbl> 4846, 224074, 54531, 27846, 4120, 1286, 4776…
#> $ plu4225_units <dbl> 117028, 118927, 408953, 9409, 371224, 58532,…
#> $ plu4770_units <dbl> 201, 338, 14388, 11342, 3934, 103, 15037, 11…
#> $ total_bagged_units <dbl> 7875, 204229, 153892, 55915, 79554, 45430, 5…
#> $ sml_bagged_units <dbl> 7867, 111600, 151346, 53094, 79340, 45156, 4…
#> $ lrg_bagged_units <dbl> 8, 92629, 2543, 2794, 214, 256, 13712, 1079,…
#> $ xlrg_bagged_units <dbl> 0, 0, 4, 28, 0, 19, 47, 5090, 2, 0, 917, 98,…
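Since markets cover only portions of a region, one way to see how much of a region they account for is to sum market units per region and compare the result with the region-level total. The sketch below uses a few rows copied from the glimpse() outputs of hass_region and hass_market above. The names region_units, market_units, and market_share are made up for this illustration. With the full datasets, the aggregation and join would also need to include week_ending and type.

```r
# Rows copied from the glimpse() output of hass_region and hass_market
# (Conventional, week ending 2017-01-02). Only a few rows are shown, so the
# result is illustrative and not the real market coverage of a region.
region_rows <- data.frame(
  region = c("California", "Great Lakes", "Midsouth"),
  region_units = c(7175277, 4225246, 2878968)
)
market_rows <- data.frame(
  region = c("Northeast", "Southeast", "Midsouth"),
  market = c("Albany", "Atlanta", "Baltimore/Washington"),
  total_bulk_and_bags_units = c(129949, 547566, 631761)
)

# Sum market units within each region.
market_totals <- aggregate(
  total_bulk_and_bags_units ~ region,
  data = market_rows,
  FUN = sum
)
names(market_totals)[2] <- "market_units"

# Keep regions present in both tables and compute the share covered by markets.
coverage <- merge(region_rows, market_totals, by = "region")
coverage$market_share <- coverage$market_units / coverage$region_units
coverage
```

Only Midsouth appears in both excerpts, so the result has a single row. Its share is well below 1 because just one of its markets is included here.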