The question we’re going to be looking at today is inspired by our course, Query GA4 Data in Google BigQuery.
Why are my Intraday Table timestamps set in the future when I query them?
The intraday table is automatically generated in your Google BigQuery dataset when you enable the streaming export in Google Analytics 4 settings.
The intraday table is named events_intraday_YYYYMMDD
and typically there’s just one of those – timestamped for today.
(If you disable the daily table export option and/or something prevents GA4 from deleting these tables you might see more than one).
The intraday table is near realtime – the latency of data hitting the table vs. when it was recorded in the data source might be as low as a few seconds.
It can thus throw you off when you see that some of the event timestamps are in the future (or too far in the past to make sense).
There is a very simple reason for this and an equally simple solution. In this article we’ll take a look at why this happens and how to solve this in your queries.
Video walkthrough
If you prefer this walkthrough in video format, you can check out the video below.
Don’t forget to subscribe to the Simmer YouTube channel for more content like this.
The intraday table
The table named events_intraday_YYYYMMDD
is the intraday table because it accumulates data throughout the day, as new events are passed through Google Analytics 4.
It’s different from the daily tables (events_YYYYMMDD
) as those are exported once per day in bulk.
As intraday collects the data from “today” in realtime, you first need to agree with Google BigQuery on what “today” refers to. BigQuery can’t guess what timezone you want to query, which is why the default UNIX timestamp format of the event_timestamp
column in BigQuery is always in UTC time.
You’ll get a list of all the hours in your intraday table with this query:
SELECT
EXTRACT(hour
FROM
TIMESTAMP_MICROS(event_timestamp)) AS hour
FROM
`project.dataset.events_intraday_YYYYMMDD`
GROUP BY
hour
ORDER BY
hour
Now, depending on whether your Google Analytics 4 property timezone setting is behind or ahead of UTC, you might see hits either in the future (if you’re behind UTC) or too far in the past (if you’re ahead of UTC, as I am).
You can check your property settings in GA4 Admin -> Property Settings.
In this case, the reporting timezone for my GA4 data is UTC/GMT +2, which means that I’ll consistently see my UTC timestamps reported two hours in the past. Fortunately there’s an easy way to fix this in your SQL queries.
Use timezone to clarify your date/time-related queries
The EXTRACT
function, by default, extracts the part from the UTC timestamp. So in the query example above, the hour of day is set in UTC, which means that it will be off by 2 hours when I compare it with the actual time when GA4 reported the hit.
You can add an extra parameter to the EXTRACT
function: AT TIME ZONE
. The value for this parameter needs to be a recognized timezone name (e.g. "Europe/Helsinki"
) or a UTC offset in hours (e.g. "+02"
).
When you then run the query, the extraction happens relative to the timezone you specified. Here’s a sample query that compares the UTC output and the conversion to my local timezone.
SELECT
EXTRACT(DAY
FROM
TIMESTAMP_MICROS(event_timestamp)) AS day_utc,
EXTRACT(hour
FROM
TIMESTAMP_MICROS(event_timestamp)) AS hour_utc,
EXTRACT(DAY
FROM
TIMESTAMP_MICROS(event_timestamp) AT TIME ZONE "Europe/Helsinki") AS day_local,
EXTRACT(hour
FROM
TIMESTAMP_MICROS(event_timestamp) AT time zone "Europe/Helsinki") AS hour_local
FROM
`simoahava-com.analytics_206575074.events_intraday_20221128`
GROUP BY
1,2,3,4
ORDER BY
4
It’s a simple solution but nevertheless effective.
Note that typically when you do queries directly against the timestamp, using TIMESTAMP_MICROS
, for example, the output is shown in UTC. You can use functions like DATETIME
, which accept an optional timezone parameter if you want to display the timestamp in some other timezone.
Summary
The UNIX timestamp counts the amount of time (typically in milliseconds) that has elapsed since January 1, 1970. It’s a fairly standardized way of aligning events in time. This timestamp is always in UTC time.
The event_timestamp
column in the Google Analytics 4 export to Google BigQuery is a UNIX timestamp in microseconds (accurate to one millionth of a second). When you convert it into a BigQuery timestamp type with TIMESTAMP_MICROS
, the value is displayed and handled in UTC time.
Typically this makes little different, because you’re not necessarily interested in how that timestamp relates to your local time, but when working with realtime (or near realtime) data, the results can seem disorienting.
Luckily SQL has you covered, and you can use timezone conversions with your date and time functions handily. You just need to know what the relevant timezone is!