Second Week with GA4
As my second week with Google Analytics kicked off, I started diving deeper into the fine-tuning of GA4 and BigQuery.
As a general outline, I have planned my schedule around completing roughly one course a week. Since there are 15 courses in total to finish in 12 weeks, I'll do one per week for the first three weeks and then pick up the pace in the following weeks to absorb the extra three. Of course, I'm keeping in mind that course length varies a lot around the average: the first course was fairly easy-going, but I know for a fact that I'll have to grapple with the longer, more academically rigorous courses as they come.
As an extra bit of information, I have also started experimenting with three websites now: the CXL demo account site, the Weebly site I set up as a course requirement, and the very blog this post is published on. This will help me build on what I have learned, see what data I can collect, and work out how I can use that data to reach wider audiences.
Now, back to the course fundamentals! Everything kicked off with the lesson on the Admin Overview, where buzzwords such as 'Streams' started to make more sense. They are called that because they act as streams of data, carrying information for the various individual aspects of tracking that GA4 (or GTM) has been doing. The information relayed can then be further segmented and queried to give us 'funneled' answers. Building on this, the data can be shown in a visual format, making it easier to pinpoint exactly where we can alter our code or fix the UI, not only to attract a larger audience but to retain the numbers we already have (Retention Rate) and convert them (Conversion Rate, further broken down into Revenue per Customer) into actual revenue (see Essay 1).
Cross-Domain Tracking was introduced next, whereby two websites can be tracked in the same property and stitched into one user journey by listing the domains in the stream's tagging settings, so a visitor moving between the sites is not counted twice. Tracking internal traffic was another interesting aspect pointed out: I can identify the devices of my own team simply by entering their IP addresses in the internal traffic rules.
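For reference, the same cross-domain linking can also be set up directly in the tag rather than in the Admin UI. Here is a minimal sketch with gtag.js, assuming my Weebly site plus a second placeholder domain and a placeholder Measurement ID (none of these come from the course):

```typescript
// Minimal cross-domain linker sketch for gtag.js; domains and ID are placeholders.
declare function gtag(...args: unknown[]): void;

// Must run before the config call so links between the two sites carry the
// linker parameter and the session is stitched together in one property.
gtag("set", "linker", {
  domains: ["datalytics101.weebly.com", "example-second-site.com"],
});

gtag("config", "G-XXXXXXXXXX"); // same Measurement ID installed on both sites
```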
Events were another topic that added new phrases to this newfound GA4 vocabulary. The idea of events was broken down into two basic buckets: Automatically Collected Events, where basic data collection happens out of the box, and Enhanced Measurement events, which are also collected automatically once enhanced measurement is switched on. With both of these, the bottom line comes down to how much automation your specific use case needs. If the data collection Google offers by default is sufficient, there is no need to add further events. On the other hand, if you need a degree of precision the default GA4 settings do not give you, you have to set up custom events. Citing an example from the course itself: scroll tracking is a default in GA4 and reports users who scroll past a single threshold of 90%. However, if we want the percentage of users who scroll past 25%, 50% and 75%, we need to create custom scroll events with those thresholds, and they will then show up in the Realtime report as well.
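To make that concrete, here is a minimal sketch of what custom scroll-depth events could look like if sent straight from the page with gtag.js. The event name scroll_depth and the parameter percent_scrolled are my own illustrative choices, not names mandated by GA4 or the course:

```typescript
// Sketch: report 25/50/75% scroll depth as custom GA4 events via gtag.js.
// Assumes the GA4 base snippet is already installed on the page.
declare function gtag(...args: unknown[]): void;

const thresholds = [25, 50, 75];
const fired = new Set<number>();

window.addEventListener("scroll", () => {
  const doc = document.documentElement;
  const scrolled =
    ((window.scrollY + window.innerHeight) / doc.scrollHeight) * 100;

  for (const t of thresholds) {
    if (scrolled >= t && !fired.has(t)) {
      fired.add(t); // fire each threshold only once per page view
      gtag("event", "scroll_depth", { percent_scrolled: t });
    }
  }
});
```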
To put it in fundamental-building-block language, events can be visualized as the roads you take along a highway (the stream) in order to reach your data. The end of the lesson also pointed to some additional reading on custom events.
Moving on to the nitty-gritty of data streams: a data stream fundamentally represents the flow of data from a customer touchpoint (e.g., an app or website) into Analytics. When you create a data stream, Analytics generates a snippet of code that you add to your app or site to collect that data. Data is collected from the time you add the code, and that data forms the basis of your reports. GA4 now lets us bring data from both apps and the web into one consolidated property and report, which greatly cuts down on duplicated setup and reporting.
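For context, the snippet GA4 generates for a web stream boils down to something like the following. I have rendered it as a TypeScript sketch rather than the raw HTML tag Google hands you, and G-XXXXXXXXXX stands in for the stream's real Measurement ID:

```typescript
// Rough TypeScript rendering of the code generated for a GA4 web data stream.
const w = window as unknown as { dataLayer: unknown[] };
w.dataLayer = w.dataLayer || [];

function gtag(..._args: unknown[]) {
  // gtag.js expects the raw `arguments` object pushed onto the dataLayer queue.
  w.dataLayer.push(arguments);
}

// Load the Google tag library for this Measurement ID.
const s = document.createElement("script");
s.async = true;
s.src = "https://www.googletagmanager.com/gtag/js?id=G-XXXXXXXXXX";
document.head.appendChild(s);

gtag("js", new Date());
gtag("config", "G-XXXXXXXXXX"); // collection (page_view etc.) starts from here
```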
Now that we understand Data Streams and Events, Funnels become ever more important. We use data filters to include or exclude event data from reports based on event-parameter values. Data filters are configured at the property level and are applied to all incoming data; they are evaluated from the point of creation forward and do not affect historical data. Using them, we can easily mark certain traffic for testing without adding any dummy streams, which saves a lot of hassle. Data that matches a given filter can then either be excluded from or included in the final reports, depending on what we want, so testing becomes a seamless experience.
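As a rough sketch of how hits get flagged so a data filter can catch them: debug_mode is the switch that marks developer traffic (it also surfaces the hits in DebugView), while the internal-traffic filter matches on the traffic_type parameter, which is normally stamped automatically by the IP rules defined in Admin. The Measurement ID below is a placeholder:

```typescript
// Sketch of flagging this browser's hits so GA4 data filters can include/exclude them.
declare function gtag(...args: unknown[]): void;

// Mark hits as developer traffic (filterable with the "Developer traffic" filter).
gtag("config", "G-XXXXXXXXXX", { debug_mode: true });

// The internal-traffic filter keys off traffic_type; the IP rules in Admin set it
// to "internal" automatically, and the same parameter can also be stamped from the tag.
gtag("set", { traffic_type: "internal" });
```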
Then further integrations with other Google products were introduced: Google Signals, which enriches the collected data with cross-device and demographic information from signed-in Google users, and ads personalization settings, which can be switched on or off per geographic region for use with Google Ads.
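On the tag side, the related switches look roughly like this. Note that enabling Google Signals in the first place is done in the Admin UI; the Measurement ID here is again a placeholder:

```typescript
// Sketch of the tag-level switches related to Google Signals and ads personalization.
declare function gtag(...args: unknown[]): void;

gtag("config", "G-XXXXXXXXXX", {
  allow_google_signals: false,             // opt this site out of Google Signals data
  allow_ad_personalization_signals: false, // keep its data out of ads personalization
});
```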
Then an important aspect of data storage came to light: event-level data sent into GA4 is retained for either 2 months or 14 months, depending on the property's settings, and Google sets the default to 2 months. In hindsight, that helps them save server space and free up capacity for new users. However, new users who are not aware of this are in for a surprise, since once that window passes the expired event data can no longer be used for funnels or ad-hoc querying in Explorations (the standard aggregated reports are not affected).
As explained earlier, internal traffic can be monitored using IP addresses. That is useful, but it covers only a minuscule fraction of the traffic a well-set-up website actually receives. For everyone else, GA4 falls back on its Reporting Identity, which is built around the Client ID. This turns out to be a random identifier stored in a first-party cookie for each browser rather than someone's actual Google Account, so within GA4 we either recognize users by their Client ID or treat them as effectively anonymous. The default reporting identity (DRI henceforth) uses a combination of User-ID, Google Signals data and the device ID (the Client ID on the web, or the app-instance ID in an app, not the IP address), which is a neat way to measure not only the sheer volume of users but also how that volume interacts with the site, since the device IDs show how many devices a single user reaches the site from. Additionally, Google Signals acts as the backup identifier by recognizing signed-in Google Accounts across devices (yes, Google can market that off too).
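To poke at this myself, the snippet below is a small sketch for reading the Client ID GA4 assigned to a browser and, where a site has its own login, supplying a User-ID; the Measurement ID and the user value are placeholders of my own:

```typescript
// Sketch: inspect the GA4 Client ID for this browser and optionally set a User-ID.
declare function gtag(...args: unknown[]): void;

// The Client ID is a random identifier stored in a first-party cookie.
gtag("get", "G-XXXXXXXXXX", "client_id", (clientId: string) => {
  console.log("GA4 client_id for this browser:", clientId);
});

// If the site has its own login, a User-ID lets GA4 tie devices to one person.
gtag("config", "G-XXXXXXXXXX", { user_id: "hashed-user-123" });
```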
That's it for this week! I will continue this series every week as usual. I'm linking my website below, so make sure you check it out. Till then, live long and prosper.
Here is the link to the site I am learning everything from!
https://cxl.com/
Here’s my site’s sub-domain, check it out!
https://datalytics101.weebly.com/
