Scroll Depth Tracking Analysis with Google Analytics R

[Image: scroll depth report generated in R, showing the percent of pageviews reaching each scroll depth by page]

71% of pageviews to my post on automated Google Analytics cost import scroll through 50% of the page, and 41% of pageviews reach the comments section. This is a tutorial on how to make the scroll depth tracking report above using the googleAnalyticsR package.

Scroll depth reporting gives you insight into how users are engaging with your content. How far down the page do visitors scroll? With the out-of-the-box Google Analytics implementation, there is no way to know.

Scroll Depth Tracking Implementation

One plug-in that you can implement to measure the percentage of the page a user scrolls through is Scroll Depth from parsnip.io. I recommend implementing the Scroll Depth plug-in via Google Tag Manager. Here is a recent post on how to implement scroll tracking via GTM. On my blog I track when 25%, 50%, 75%, and 100% of the page has been viewed. I also track when the user reaches the comments section of the page. If you need help implementing the plug-in, let me know. The rest of the post is about how best to report on and analyze the page scroll depth data.

Scroll Depth Reporting

I found a lot written on how to implement scroll depth tracking, but very little about how to report on and analyze scroll depth data. The scroll data is tracked using Google Analytics events. When you use events in Google Analytics to measure multiple site actions, the data can be challenging to report on. The Google Analytics event category, event action, and event label are report “dimensions” (not to be confused with Custom Dimensions), and total events and unique events are report “metrics” (not to be confused with Custom Metrics). For the page scroll depth reporting, 25%, 50%, 75%, and 100% are captured in the event label “dimension” (the eventLabel used in step 8 below), and the count of occurrences of each percentage is captured in the total events “metric”.

But the question I really want to answer is: how often is each of the 25%, 50%, 75%, and 100% scroll depths reached on each page? I want page to be my report dimension, and I want the percentage of pageviews reaching 25%, 50%, 75%, and 100% to be my report metrics.

Here is how the event reporting looks in the Google Analytics interface:

[Image: scroll depth events as shown in the Google Analytics interface]

Here is how the same scroll depth reporting looks after transforming the data in R:

[Image: scroll depth report generated in R, showing the percent of pageviews reaching each scroll depth by page]

Looking at the scroll depth report generated in R above, you can see that the data is sorted by pageviews for the date range. The top page had 2569 pageviews. 90% of those pageviews scrolled down 25% of the page. 27% of those pageviews continued scrolling and reached the comments (shown as percent_disqus in the report). The following section describes how to run this R script. No previous R experience is necessary. I’ll walk you through each step. Let me know if you have any questions.

Scroll Depth Report R Script Tutorial

1) Download and Install R.

2) Install RStudio.

3) Launch RStudio and install the R packages googleAnalyticsR and tidyr. In the Console pane in RStudio (the bottom left pane), run the code below.
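The install step can be sketched as a single call. This is a minimal version that assumes both packages are installed from CRAN with the default repository settings:

```r
# Install the two packages used in this tutorial from CRAN.
# Only needs to be run once per R installation.
install.packages(c("googleAnalyticsR", "tidyr"))
```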

When the packages install you’ll see a message in the console that says “package successfully installed…”

4) Save or copy the script below to your computer and open it in RStudio.

5) Run the pageScrollDepth.R Script

If this is your first time running the googleAnalyticsR package you’ll need to authorize the script. Add your Google Analytics viewID on row 13 of the script. Make sure you are logged into your account with access to Google Analytics on your web browser. When you run the R script a Google webpage will open prompting you to choose your account.

[Image: googleAnalyticsR account authorization, choosing a Google account]

Then you’ll need to allow access to googleAuthR as shown below.

[Image: googleAnalyticsR account authorization, allowing access to googleAuthR]

Once the authentication is complete you should see a message saying “Authentication complete. Please close this page and return to R.”

[Image: googleAnalyticsR account authorization, authentication complete message]

6) Retrieve your Google Analytics viewID in R

Rows 8 through 10 of the script retrieve a list of all the accounts that you have access to via the Google Analytics Management API. You are then shown a view of the dataframe with this info, called my_accounts. You can find the viewId in my_accounts for the Google Analytics view that you’ll use in the script without even opening the Google Analytics web UI!
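As a rough sketch of what this step might look like, the helper below authorizes and then pulls the account list with googleAnalyticsR. The function name get_my_accounts is hypothetical and is not taken from the original script; ga_auth() and ga_account_list() are the package functions it relies on.

```r
# Hypothetical helper: authorize, then list every account, property,
# and view the logged-in Google user can access.
get_my_accounts <- function() {
  googleAnalyticsR::ga_auth()          # opens the browser authorization flow on first run
  googleAnalyticsR::ga_account_list()  # returns a dataframe that includes a viewId column
}

# Usage (requires a Google login, so it is left commented out here):
# my_accounts <- get_my_accounts()
# View(my_accounts)
```

Wrapping the calls in a function keeps the sketch from triggering the login prompt until you actually call it.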

7) googleAnalyticsR Scroll Depth Queries

Rows 15 through 27 contain the googleAnalyticsR queries. The first query, on row 15, pulls pageviews by page, and the second query, on row 21, pulls the scroll depth event data by page. Make sure to specify your date_range on rows 17 and 23. The filtersExpression on row 26 includes #disqus because I track when visitors reach the comment section, which has a div with id=disqus_thread. Remove or change this filtersExpression if you don’t use Disqus.
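The two queries could look roughly like the sketch below. It is not the original script: the function name pull_scroll_depth_data, the pagePath dimension, the pageviews_df name, and especially the filter string are assumptions made for illustration. Adjust the filter to match the event labels your own scroll tracking sends.

```r
# Hypothetical wrapper around the two queries described above.
# view_id:    your Google Analytics viewId
# date_range: c("YYYY-MM-DD", "YYYY-MM-DD")
pull_scroll_depth_data <- function(view_id, date_range) {
  # Query 1: pageviews for each page.
  pageviews_df <- googleAnalyticsR::google_analytics(
    viewId     = view_id,
    date_range = date_range,
    metrics    = "pageviews",
    dimensions = "pagePath",
    max        = -1  # -1 asks for all rows
  )

  # Query 2: scroll depth events for each page and event label.
  # The regex below is an assumed filter, not the author's exact one.
  df <- googleAnalyticsR::google_analytics(
    viewId            = view_id,
    date_range        = date_range,
    metrics           = "totalEvents",
    dimensions        = c("pagePath", "eventLabel"),
    filtersExpression = "ga:eventLabel=~^(25%|50%|75%|100%|#disqus_thread)$",
    max               = -1
  )

  list(pageviews = pageviews_df, events = df)
}

# Usage (needs authorization and a real viewId, so it is commented out):
# result <- pull_scroll_depth_data(123456789, c("2017-01-01", "2017-01-31"))
```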

8) tidyr spread() Transforms Scroll Depth Data

The original Google Analytics R event query dataframe df is shown below.

[Image: original event query dataframe df in R]

The query data is transformed with the spread() function from the tidyr package.

In the spread() function, the data is df, the dataframe with the original event query data. The eventLabel is the key; it holds the page scroll depth values that we want broken out into separate columns for 25%, 50%, 75%, 100%, and #disqus_thread. The totalEvents is the value that fills those new scroll depth columns. The fill=0 argument replaces NAs with 0.

The transformed dataframe df2 is shown below. This transformation takes the long Google Analytics event data and makes it wide. You can read more about long and wide data here.
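To make the arguments described above concrete, here is a small sketch that runs spread() on a made-up dataframe shaped like the event query result. The page paths and counts are invented for illustration, and pagePath is an assumed column name. spread() still works in current versions of tidyr, although pivot_wider() has since superseded it.

```r
library(tidyr)

# Toy stand-in for the long event data: one row per page and scroll label.
df <- data.frame(
  pagePath    = c("/post-a/", "/post-a/", "/post-a/", "/post-b/", "/post-b/"),
  eventLabel  = c("25%", "50%", "#disqus_thread", "25%", "50%"),
  totalEvents = c(90, 71, 41, 40, 20),
  stringsAsFactors = FALSE
)

# Long to wide: each distinct eventLabel becomes its own column filled with
# totalEvents. Page/label combinations with no events get 0 instead of NA.
df2 <- spread(df, key = eventLabel, value = totalEvents, fill = 0)

print(df2)
```

In this toy data, /post-b/ has no #disqus_thread row, so its value in that column comes from fill = 0.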

[Image: transformed dataframe df2 after tidyr spread()]

9) Merge Transformed Scroll Depth Data with Page Views Data
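A minimal sketch of this merge step, using base R's merge() on made-up data. The names pageviews_df and pagePath are assumptions for illustration; df2 and df3 follow the naming used in this post.

```r
# Toy pageviews query result: one row per page.
pageviews_df <- data.frame(
  pagePath  = c("/post-a/", "/post-b/", "/post-c/"),
  pageviews = c(100, 50, 10),
  stringsAsFactors = FALSE
)

# Toy wide scroll depth data, shaped like df2 from step 8.
# check.names = FALSE keeps column names such as "25%" unchanged.
df2 <- data.frame(
  pagePath = c("/post-a/", "/post-b/"),
  `25%`    = c(90, 40),
  `50%`    = c(71, 20),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# Join the two dataframes on the page column. By default merge() keeps only
# pages present in both, so /post-c/ (no scroll events) is dropped.
df3 <- merge(pageviews_df, df2, by = "pagePath")

print(df3)
```

If you would rather keep pages that have pageviews but no scroll events, merge() accepts all.x = TRUE, which leaves NA in the scroll columns for those pages.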

10) Calculate the Percentage of People that Reach Each Scroll Depth for Each Page

Showing the percent of total pageviews that reach each page scroll depth makes it easy to compare across pages with different numbers of pageviews. For example 100 people reaching the 50% scroll depth of the homepage is pretty meaningless. But 36% of the total pageviews to the homepage reaching the 50% scroll depth is a lot more meaningful. It also makes it easier to compare this percent across pages. For example 84% of the total pageviews to the about page reach the 50% scroll depth which is clearly greater than the homepage.
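The calculation can be sketched as follows on made-up data. The column names percent25, percent50, and percent_disqus are assumptions modeled on the report shown earlier. Note the backticks, which R requires around column names like 25% that are not valid R names.

```r
# Toy merged dataframe, shaped like df3 after step 9.
df3 <- data.frame(
  pagePath         = c("/post-a/", "/post-b/"),
  pageviews        = c(100, 50),
  `25%`            = c(90, 40),
  `50%`            = c(71, 20),
  `#disqus_thread` = c(41, 0),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# Share of pageviews reaching each depth, rounded to 2 digits.
# For /post-a/: 90 / 100 = 0.90, 71 / 100 = 0.71, 41 / 100 = 0.41.
df3$percent25      <- round(df3$`25%` / df3$pageviews, digits = 2)
df3$percent50      <- round(df3$`50%` / df3$pageviews, digits = 2)
df3$percent_disqus <- round(df3$`#disqus_thread` / df3$pageviews, digits = 2)

print(df3)
```

The results are fractions between 0 and 1; multiply by 100 if you prefer to read them as percentages.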

The code above creates new calculated columns for the percent of pageviews reaching each page depth and adds them to the dataframe df3. It rounds the percentage to 2 digits.

11) Clean Up the Page Scroll Depth Report

The final step is to clean up the scroll depth report so that it shows only the page as the dimension, pageviews as one metric, and the percentage of pageviews reaching each scroll depth as the other metrics. This is captured in dataframe df4. Then you sort the data in descending order of pageviews, which is captured in dataframe df5.
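One way this cleanup could be written in base R is sketched below on made-up data. The column names are assumptions carried over from the earlier steps; df4 and df5 follow the naming used in this post.

```r
# Toy dataframe shaped like df3 after the percent columns were added.
df3 <- data.frame(
  pagePath  = c("/post-b/", "/post-a/"),
  pageviews = c(50, 100),
  `25%`     = c(40, 90),
  percent25 = c(0.80, 0.90),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# Keep only the page, the pageviews, and the percent columns.
df4 <- df3[, c("pagePath", "pageviews", "percent25")]

# Sort by pageviews, largest first.
df5 <- df4[order(-df4$pageviews), ]

print(df5)
```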

12) Your Page Scroll Depth Report

Run View(df5) in the R console to see the final page scroll depth report below.

[Image: final scroll depth report, showing the percent of pageviews reaching each scroll depth by page]

Use the Built-In Google Analytics API v4 Pivot Functionality

The Google Analytics v4 reporting API has built-in pivot table functionality. Rather than using the spread() function from the tidyr package in step 8, you could use the built-in pivot directly from the API. Below is a code example of how to use the pivot table functionality to pull the pivoted data directly from the Google Analytics API.
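A sketch of what such a pivot request might look like with googleAnalyticsR is shown here. It uses pivot_ga4() to describe the pivot and passes it to google_analytics() through the pivots argument. The function name pull_scroll_depth_pivot, the pagePath dimension, and the maxGroupCount value are assumptions for illustration, not the original example.

```r
# Hypothetical wrapper: ask the API to pivot totalEvents by eventLabel,
# so each scroll depth label comes back as its own set of columns.
pull_scroll_depth_pivot <- function(view_id, date_range) {
  # Describe the pivot: one column group per eventLabel value.
  # maxGroupCount = 5 assumes five labels (25%, 50%, 75%, 100%, #disqus_thread).
  scroll_pivot <- googleAnalyticsR::pivot_ga4(
    pivot_dim     = "eventLabel",
    metrics       = "totalEvents",
    maxGroupCount = 5
  )

  googleAnalyticsR::google_analytics(
    viewId     = view_id,
    date_range = date_range,
    metrics    = "totalEvents",
    dimensions = "pagePath",
    pivots     = list(scroll_pivot),
    max        = -1  # -1 asks for all rows
  )
}

# Usage (needs authorization and a real viewId, so it is commented out):
# pivoted <- pull_scroll_depth_pivot(123456789, c("2017-01-01", "2017-01-31"))
```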

If you have any questions or think of other great ways to analyze and report on Google Analytics scroll depth data let me know in the comments.

  • Thomas Bosilevac

    This is great! For blogs, at MashMetrics we have standardized on 15 goals to look at page scroll, video tracking, time on page, etc.
    Since we have dozens of sites on the same metrics standard, this could be used to see patterns across accounts.

    While we set up a scroll goal, this is also a great way to see the entire progression.

    Very cool!

    • Thanks for sharing, Thomas. Glad you found the scroll depth tracking analysis useful. R is a great way to automate this kind of analysis for multiple accounts across various metrics. What are some of the other 15 standard goals you set up?

      -Ryan

      • Thomas Bosilevac

        We have contact clicks, form submits, scroll 75%, >2min on page, >45s on page, video at 10% and 90%, and site engagement like >5,10,15 pages and >5min, 10min on site.

  • Erez

    For me multiplying by 100 makes it easier to understand as percentage.

    df3$percent10 <- round((df3$`10%` / df3$pageviews) * 100, digits = 2)
    df3$percent25 <- round((df3$`25%` / df3$pageviews) * 100, digits = 2)
    df3$percent50 <- round((df3$`50%` / df3$pageviews) * 100, digits = 2)
    df3$percent75 <- round((df3$`75%` / df3$pageviews) * 100, digits = 2)
    df3$percent90 <- round((df3$`90%` / df3$pageviews) * 100, digits = 2)
    df3$percent100 <- round((df3$`100%` / df3$pageviews) * 100, digits = 2)

    Thank you for a great article