As a free benefit for participants, we would like to extend an invitation to the Amazon SageMaker workshop on Tue, Apr 24 from 2:30-4:30pm.
Welcome to the Retail vertical page! To make the most of your time during the event weekend, please review our key educational materials and data sets.
Be Prepared! Start thinking through what types of data could power your business and product ideas. Often a combination of multiple, disparate data sets can yield the most ingenious ideas and solutions!
The following videos were recorded during the April 19 Panel event. You may wish to reference them in preparation for the weekend ML event.
The retail market includes any sale of products to consumers, from eCommerce, grocery, and gasoline to Ikea and Stitch Fix. How can we utilize technology to improve the shopping experience?
The US retail market is worth $3.5 trillion and growing at 3%. eCommerce accounts for only 10% of total sales. The smart retail market is worth $10 billion but is growing at 24%. Grocery is a $663 billion market in the US. Growth is driven by the adoption of smartphones and changing customer demands (customer service, delivery, one-click purchasing).
Several types of technology are used in smart retail: Bluetooth, ZigBee, RFID, Wi-Fi, VR/AR, and LPWAN. You can also segment the market by application: smart labels, store navigation, smart payments, robotics, analytics, and visual marketing. Among categories, hardware has the largest market size, since limited computing power will constrain adoption of new technologies and require upgrades. Robotics is estimated to have the most growth among segments.
64% of US retailers sell omnichannel. 40% of retailers plan to increase their spend on technology. Shoppers still generate far more data than retailers currently take advantage of.
Purple by Global Partner Acq ($1.1B) - new age mattresses
Blue Nile by Bain/Bow Street ($500M)
Chewy ($451M raised)
Brandless ($293M raised)
Dollar Shave Club ($163M raised)
Standard Cognition ($5.5M seed) - get rid of cashiers
Fourpost ($5M seed) - community experience with brands
Grailed ($15M Series A) - peer-to-peer NYC marketplace
How can we improve the shopping experience?
Why are there ever lines to pay for items?
Why is there so much manual labor and operating cost?
Why does trying items on take so long, and why can't it be done online?
How can I try out online products?
Can I 3D print products at home?
Can technology improve my style?
How can we better navigate stores?
Smartphone adoption / mobile purchases
Social Media as a marketplace
IoT - RFIDs, Beacons, Sensors and cheap tech
Reimagined Brick and Mortar
Customer-controlled delivery/search - Buy online, pick up in store, Amazon Day, Same Day
Personalized/Predictive Experiences - Yoox Net-A-Porter
Subscriptions - Stitch Fix, BabyGap OutfitBox, Hasbro Party Crate
Augmented Reality - Wayfair, Lowe's, Salon Project
Messaging with customers - Facebook Messenger
Checkout Automation - Cashless, Amazon Go
Optimize store layouts
Speed up necessary shopping
Bring new products to consumers
Integrating different systems
Automate labor - Sales assistant, cashier, returns processing
Improve ad experience
The retail market is so large and valuable that there are many right-tailed opportunities. We should be able to identify interesting spaces to innovate and capture portions of this expansive market.
Where do we start?
No barrier to entry, price competition
Your novel business idea should be grounded in real-world data with plausible machine-learning/analytics on top. We've compiled a collection of datasets from which to gain inspiration. Note that you are not restricted to basing your idea on the data sets below. You may discover other open source data sets that inspire your creativity or you may bring your own proprietary data sets if you wish.
Many of the datasets below are from Kaggle, Figure-Eight (Crowdflower), Data.World, etc. The advantage of these datasets is that many have been cleaned and normalized and are ready to be explored with ML and data science tools. Note that these datasets are often intended for research purposes only. Be sure to read any associated license agreements to understand whether there are commercial restrictions if you plan to continue using the data after the workshop is over.
This dataset contains product reviews and metadata from Amazon, including 142.8 million reviews spanning May 1996 - July 2014.
Idea: Can you design a model that learns what a high-quality, specific, helpful review looks like, and give you real-time suggestions as you write reviews?
Idea 2: Can you mine the review data to pull out the three things people most like and the three things people most dislike about various products?
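As a starting point for Idea 2, here is a minimal sketch of one naive approach: count which terms appear most often in high-rated versus low-rated reviews. The field names `overall` and `reviewText`, the sample records, the stopword list, and the rating thresholds are all assumptions made for illustration; check the actual schema of the dataset you download. A real solution would likely need phrase extraction and better normalization than simple word counts.

```python
import re
from collections import Counter

# Hypothetical records; "overall" (star rating) and "reviewText" are
# assumed field names, not a confirmed schema.
SAMPLE_REVIEWS = [
    {"overall": 5, "reviewText": "Great battery life and a bright screen."},
    {"overall": 4, "reviewText": "Battery lasts all day, screen is sharp."},
    {"overall": 1, "reviewText": "The charger broke and the strap broke too."},
    {"overall": 2, "reviewText": "Flimsy strap, and the charger stopped working."},
]

# Tiny illustrative stopword list; a real project would use a fuller one.
STOPWORDS = {"a", "all", "and", "is", "the", "too"}


def tokenize(text):
    """Lowercase the text and return its words, minus stopwords."""
    return [w for w in re.findall(r"[a-z']+", text.lower()) if w not in STOPWORDS]


def top_terms(reviews, n=3):
    """Return (liked, disliked): the n most common terms in each group.

    A term is counted once per review. Ratings of 4 or more count as
    "liked" and ratings of 2 or less as "disliked" (assumed thresholds).
    """
    liked, disliked = Counter(), Counter()
    for review in reviews:
        terms = set(tokenize(review["reviewText"]))
        if review["overall"] >= 4:
            liked.update(terms)
        elif review["overall"] <= 2:
            disliked.update(terms)
    return liked.most_common(n), disliked.most_common(n)


if __name__ == "__main__":
    liked, disliked = top_terms(SAMPLE_REVIEWS, n=2)
    print("Most liked:", liked)
    print("Most disliked:", disliked)
```

Grouping the reviews by product ID before calling `top_terms` would give the per-product view the idea describes.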
This dataset contains Question and Answer data from Amazon, totaling around 1.4 million answered questions.
A large collection of Amazon and Yelp reviews, plus Yahoo Answers data.
Idea: Design a review summarizer that summarizes the positive and negative reviews for a product to allow users to quickly understand overall review sentiment from users.
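A very rough baseline for this idea might look like the sketch below. It is not a true text summarizer: it reports the average rating, the share of positive and negative reviews, and one sample review from each group. The field names `rating` and `text`, the thresholds, and the "longest review is the representative one" heuristic are assumptions chosen only to keep the example short.

```python
from statistics import mean


def summarize_reviews(reviews, positive_min=4, negative_max=2):
    """Build a simple sentiment summary for one product's reviews.

    Each review is a dict with hypothetical keys "rating" (1-5) and "text".
    """
    if not reviews:
        raise ValueError("reviews must not be empty")

    positive = [r for r in reviews if r["rating"] >= positive_min]
    negative = [r for r in reviews if r["rating"] <= negative_max]

    def longest_text(group):
        # Crude heuristic: treat the longest review as the most informative.
        return max(group, key=lambda r: len(r["text"]))["text"] if group else None

    return {
        "count": len(reviews),
        "average_rating": round(mean(r["rating"] for r in reviews), 2),
        "positive_share": len(positive) / len(reviews),
        "negative_share": len(negative) / len(reviews),
        "sample_positive": longest_text(positive),
        "sample_negative": longest_text(negative),
    }


if __name__ == "__main__":
    sample = [
        {"rating": 5, "text": "Love it."},
        {"rating": 4, "text": "Works well and arrived quickly."},
        {"rating": 3, "text": "It is fine."},
        {"rating": 1, "text": "Stopped working after a week."},
    ]
    print(summarize_reviews(sample))
```

Replacing the sample-review heuristic with a learned sentiment model or an extractive summarizer is where the machine learning would come in.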
Labeled tweets about multiple brands and products.
Gengo scoured the internet to gather a list of publicly available ecommerce datasets for machine learning projects. Enjoy!
Access and analyze open grocery data for Canada.