2017-10-19

U-SQL: "Array cannot be null" when using the Avro extractor

I am using Event Hubs with Capture to Blob Storage, and I have a script based on the AvroSamples that tries to transform the captured file.

This is my U-SQL script:

REFERENCE ASSEMBLY [Newtonsoft.Json]; 
REFERENCE ASSEMBLY [log4net]; 
REFERENCE ASSEMBLY [Avro]; 
REFERENCE ASSEMBLY [Microsoft.Analytics.Samples.Formats]; 


DECLARE @ABI_DATE string = "2017/10/17/"; //replace by ADF pipeline 
DECLARE @input_file string = "wasb://[email protected]/namespace/eh/{*}/" + @ABI_DATE +"{*}/{*}/{*}"; 
DECLARE @output_file string = @"/output/" + @ABI_DATE + "extract.csv"; 


@rs = 
EXTRACT 
     SequenceNumber long 
     ,EnqueuedTimeUtc string 
     ,Body byte[] 
FROM @input_file 
USING new Microsoft.Analytics.Samples.Formats.ApacheAvro.AvroExtractor(@" 
    { 
     ""type"":""record"", 
     ""name"":""EventData"", 
     ""namespace"":""Microsoft.ServiceBus.Messaging"", 
     ""fields"":[ 
      {""name"":""SequenceNumber"",""type"":""long""}, 
      {""name"":""Offset"",""type"":""string""}, 
      {""name"":""EnqueuedTimeUtc"",""type"":""string""}, 
      {""name"":""SystemProperties"",""type"":{""type"":""map"",""values"":[""long"",""double"",""string"",""bytes""]}}, 
      {""name"":""Properties"",""type"":{""type"":""map"",""values"":[""long"",""double"",""string"",""bytes""]}}, 
      {""name"":""Body"",""type"":[""null"",""bytes""]} 
     ] 
    } 
"); 

@cnt = 
SELECT 
    SequenceNumber 
    ,Encoding.UTF8.GetString(Body) AS Json //THIS LINE BREAKS !!!! 
    ,EnqueuedTimeUtc 
FROM @rs; 

OUTPUT @cnt TO @output_file USING Outputters.Text(); 

If I run the same extractor but comment out the Body field, it works as expected.

This is the error:

Inner exception from user expression: Array cannot be null. Parameter name: bytes Current row dump: SequenceNumber: 4622 EnqueuedTimeUtc: NULL Body: NULL

Error while evaluating expression Encoding.UTF8.GetString(Body)

Answer

Florian Mander gave me the explanation:

the extractor works correctly, you are just passing null values (intentionally, because it's in the schema) in a method (Encoding.GetString) that doesn't accept null as input. In your latest solution you will lose all the records that don't have a body, though. That's a non technical decision if this is fine or not.
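To see how many records are affected before deciding how to handle them, the null bodies can be counted directly. This is only a sketch: it assumes the @rs rowset from the question, and the output path and the names @nullBodies and NullBodyCount are made up for illustration.

```usql
// Count the rows whose Body is null (the schema allows this via ["null","bytes"]).
@nullBodies =
    SELECT COUNT(*) AS NullBodyCount
    FROM @rs
    WHERE Body == null;

// Hypothetical output location; adjust to your own store.
OUTPUT @nullBodies TO "/output/null_body_count.csv" USING Outputters.Csv();
```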

So the way to fix it is to filter out the rows with a null Body using a WHERE clause:

@cnt = 
SELECT 
    SequenceNumber 
    ,Encoding.UTF8.GetString(Body) AS Json 
    ,EnqueuedTimeUtc 
FROM @rs 
WHERE Body != null; 
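
As the quoted explanation points out, the WHERE filter drops every record that has no body. If those records need to be kept, a conditional expression is one possible alternative. The following is a minimal sketch, assuming the same @rs rowset and @output_file variable as in the question; it has not been run against the original data.

```usql
@cnt =
    SELECT
        SequenceNumber
        // Decode only when Body is present; otherwise emit a null string
        // so the row is kept instead of being filtered out.
        ,(Body == null ? (string) null : Encoding.UTF8.GetString(Body)) AS Json
        ,EnqueuedTimeUtc
    FROM @rs;

OUTPUT @cnt TO @output_file USING Outputters.Text();
```

Whether to keep or drop the empty records is, as noted in the explanation, not a technical decision; this variant only shows that both options are available.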