2017-04-02

BigQuery error: UNIQUE_HEAP requires an int32 argument

I am trying to use COUNT(DISTINCT field, n) in Google BigQuery with legacy SQL, but I get the following error message:

UNIQUE_HEAP requires an int32 argument which is greater than 0 (error code: invalidQuery)

Here is the query I used:

SELECT 
    hits.page.pagePath AS Page, 
    COUNT(DISTINCT CONCAT(fullVisitorId, INTEGER(visitId)), 1e6) AS UniquePageviews, 
    COUNT(DISTINCT fullVisitorId, 1e6) as Users 
FROM 
    [xxxxxxxx.ga_sessions_20170101] 
GROUP BY 
    Page 
ORDER BY 
    UniquePageviews DESC 
LIMIT 
    20 

BigQuery does not even show the line number of the error, so I am not sure which line is causing it.

What could be the cause of the above error?

Answer

You are using 1e6 in your COUNT(DISTINCT). The literal 1e6 is a floating-point value, not an integer, which is why UNIQUE_HEAP complains that it requires an int32 argument. Use an actual INTEGER value for the second parameter n instead (the default is 1000), or use EXACT_COUNT_DISTINCT().
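
Applied to the query from the question, the fix might look like the sketch below. The only change is that the literal 1e6 is written as the integer 1000000; everything else, including the placeholder table name, is copied unchanged from the question and has not been verified against real data.

```sql
-- Legacy SQL: the second argument of COUNT(DISTINCT ...) must be an integer literal.
SELECT
    hits.page.pagePath AS Page,
    COUNT(DISTINCT CONCAT(fullVisitorId, INTEGER(visitId)), 1000000) AS UniquePageviews,
    COUNT(DISTINCT fullVisitorId, 1000000) AS Users
FROM
    [xxxxxxxx.ga_sessions_20170101]
GROUP BY
    Page
ORDER BY
    UniquePageviews DESC
LIMIT
    20
```

As the documentation quoted in this answer notes, a large n can slow the query down or cause it to fail, so a smaller threshold may be preferable if exact counts are only needed up to a lower bound.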

COUNT(DISTINCT) documentation

EXACT_COUNT_DISTINCT() documentation

If you require greater accuracy from COUNT(DISTINCT), you can specify a second parameter, n, which gives the threshold below which exact results are guaranteed. By default, n is 1000, but if you give a larger n, you will get exact results for COUNT(DISTINCT) up to that value of n. However, giving larger values of n will reduce scalability of this operator and may substantially increase query execution time or cause the query to fail.

To compute the exact number of distinct values, use EXACT_COUNT_DISTINCT. Or, for a more scalable approach, consider using GROUP EACH BY on the relevant field(s) and then applying COUNT(*). The GROUP EACH BY approach is more scalable but might incur a slight up-front performance penalty.
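
The two alternatives described above could be sketched for the Users column roughly as follows. Both queries assume the same placeholder table as the question, and the GROUP EACH BY variant is one possible reading of the documented approach rather than a tested query.

```sql
-- Alternative 1: exact distinct count, no threshold argument needed.
SELECT
    hits.page.pagePath AS Page,
    EXACT_COUNT_DISTINCT(fullVisitorId) AS Users
FROM
    [xxxxxxxx.ga_sessions_20170101]
GROUP BY
    Page
ORDER BY
    Users DESC
LIMIT
    20
```

```sql
-- Alternative 2: deduplicate (Page, fullVisitorId) pairs with GROUP EACH BY,
-- then count the remaining rows per page with COUNT(*).
SELECT
    Page,
    COUNT(*) AS Users
FROM (
    SELECT
        hits.page.pagePath AS Page,
        fullVisitorId
    FROM
        [xxxxxxxx.ga_sessions_20170101]
    GROUP EACH BY
        Page, fullVisitorId
)
GROUP BY
    Page
ORDER BY
    Users DESC
LIMIT
    20
```

The first form is the shorter one; the second trades an extra grouping step for better scalability on large tables, as the quoted documentation describes.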
