%%bash
bash /etl/dfp/gcs/gcs_list_files.sh
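import pandas as pd
dfp_logs = pd.read_table('/ebs/dfp_logs_gcs.txt', header=None)
# Log type = text before the first '_' in the fourth '/'-separated path component
dfp_logs[0].apply(lambda x: x.split('/')[3].split('_')[0]).drop_duplicates()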
0                         NetworkActiveViews
1455                       NetworkActivities
2901              NetworkBackfillActiveViews
4356                   NetworkBackfillClicks
5811              NetworkBackfillImpressions
7266     NetworkBackfillRichMediaConversions
8667         NetworkBackfillVideoConversions
10122                          NetworkClicks
11577                     NetworkImpressions
13032                NetworkVideoConversions
Name: 0, dtype: object
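To make the parsing step concrete, here is a minimal sketch of the same extraction applied to a few made-up paths, with a count of files per log type. The bucket and file names are hypothetical, chosen only so that the fourth '/'-separated component is a file name with an underscore-delimited prefix; the real listing in /ebs/dfp_logs_gcs.txt may be laid out differently.

```python
from collections import Counter

# Hypothetical paths for illustration only (not taken from the real listing).
sample_paths = [
    "gs://example-bucket/NetworkImpressions_000000_20160101_00.gz",
    "gs://example-bucket/NetworkImpressions_000000_20160101_01.gz",
    "gs://example-bucket/NetworkClicks_000000_20160101_00.gz",
]

def log_type(path):
    """Return the file-name prefix before the first underscore."""
    # 'gs://bucket/file' splits into ['gs:', '', 'bucket', 'file'],
    # so index 3 is the file name under this assumed layout.
    return path.split("/")[3].split("_")[0]

# Count how many files belong to each log type.
files_per_type = Counter(log_type(p) for p in sample_paths)
print(files_per_type)
```

If the real paths have more directory levels, the index 3 would need to change accordingly.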
What we are doing: Using location data to create insights and products that can be monetized with businesses (both advertising and non-advertising)
· Understanding and exploring location data
· Building a places database
· Understanding patterns and segmenting customers based on the types of businesses/places they visit
· Predicting which users will visit a new business, based on places visited and demographics/other third-party data
· Visualization for sales (movement of users across the day, e.g. traffic to a Walmart/coffee shop by time of day/day of week) and for the analytics we are currently doing
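As a rough illustration of the "traffic by time of day/day of week" idea, the sketch below counts visits to one place by (day of week, hour). The visit records, their field layout, and the function name `traffic_by_time` are all assumptions made for this example; the actual location data may be structured quite differently.

```python
from collections import Counter
from datetime import datetime

DAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
        "Friday", "Saturday", "Sunday"]

# Hypothetical visit records: (user_id, place_name, ISO timestamp).
visits = [
    ("u1", "coffee_shop", "2016-01-04T08:15:00"),  # a Monday
    ("u2", "coffee_shop", "2016-01-04T08:40:00"),  # a Monday
    ("u1", "walmart", "2016-01-09T14:05:00"),      # a Saturday
    ("u3", "coffee_shop", "2016-01-05T17:30:00"),  # a Tuesday
]

def traffic_by_time(records, place):
    """Count visits to `place`, keyed by (day of week, hour of day)."""
    counts = Counter()
    for _user, name, ts in records:
        if name == place:
            when = datetime.fromisoformat(ts)
            counts[(DAYS[when.weekday()], when.hour)] += 1
    return counts

coffee_traffic = traffic_by_time(visits, "coffee_shop")
for (day, hour), n in sorted(coffee_traffic.items()):
    print(f"{day} {hour:02d}:00  {n} visit(s)")
```

A table of counts like this could then feed a heatmap or bar chart for the sales team.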
Data we use:
· Location Data
· Places Data
· Exploring the use of other data sources (within and outside the company)
o Factual/Lotame Segments
o User Information that we collect
Tools:
· Python
· ArcGIS
Some Challenges: We are currently performing these analyses using one day of location data. Some capabilities that we need to develop:
· Scaling these analyses to larger data
· Learning and using geospatial libraries in Python instead of standalone geospatial tools (ArcGIS Integration, QGIS)
· Integrating other data sources with the location and places data
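To give a feel for doing geospatial work directly in Python, here is a minimal sketch that matches a location ping to the nearest place using the haversine (great-circle) distance. It uses only the standard library rather than a dedicated geospatial package, and the place coordinates, the 100 m threshold, and the function names are assumptions for illustration, not part of the current workflow.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))

# Hypothetical places database: name -> (lat, lon).
places = {
    "coffee_shop": (40.7500, -73.9900),
    "walmart": (40.8000, -74.0500),
}

def nearest_place(lat, lon, places, max_distance_m=100.0):
    """Return (name, distance_m) of the closest place within the threshold, else None."""
    name, dist = min(
        ((n, haversine_m(lat, lon, plat, plon)) for n, (plat, plon) in places.items()),
        key=lambda pair: pair[1],
    )
    return (name, dist) if dist <= max_distance_m else None

print(nearest_place(40.7501, -73.9901, places))
```

This brute-force comparison checks every place for every ping, so it would not scale beyond small samples; a spatial index from a geospatial library would likely be needed for more than one day of data.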