2016-04-21
0

Regex crafting for SMART data

I am racking my brain trying to come up with a regular expression that can pull the data I want out of this SMART data output:

Offline data collection status: (0x00) Offline data collection activity 
        was never started. 
        Auto Offline Data Collection: Disabled. 
Self-test execution status:  ( 0) The previous self-test routine completed 
        without error or no self-test has ever 
        been run. 
Total time to complete Offline 
data collection:  ( 139) seconds. 
Offline data collection 
capabilities:   (0x73) SMART execute Offline immediate. 
        Auto Offline data collection on/off support. 
        Suspend Offline collection upon new 
        command. 
        No Offline surface scan supported. 
        Self-test supported. 
        Conveyance Self-test supported. 
        Selective Self-test supported. 
SMART capabilities:   (0x0003) Saves SMART data before entering 
        power-saving mode. 
        Supports SMART auto save timer. 
Error logging capability:  (0x01) Error logging supported. 
        General Purpose Logging supported. 
Short self-test routine 
recommended polling time: ( 2) minutes. 
Extended self-test routine 
recommended polling time: (100) minutes. 
Conveyance self-test routine 
recommended polling time: ( 3) minutes. 
SCT capabilities:   (0x1081) SCT Status supported. 

The regex I have come up with so far is:

/([^A-Za-z]?:)([\w\s\/().\-]+\.)/gm 

The goal of my regex is to get the "values" of each of the "General SMART Values" from the smartctl -a output. The problem is that the output is formatted in a way that makes it difficult to pull the desired values into an array.

I am already able to pull just the SMART value keys, such as Offline data collection status or Self-test execution status, so now I am working on pulling the value of each of these parameters. That would be something like (139) seconds or (0x00) Offline data collection activity was never started.

What separates the key from the value is the colon followed by some spaces. However, one of the values contains text that also has a colon in it, which makes capturing extremely difficult. I need to capture all of the following without accidentally capturing the next parameter's value.

Offline data collection status: (0x00) Offline data collection activity 
        was never started. 
        Auto Offline Data Collection: Disabled. 
Self-test execution status:  ( 0) The previous self-test routine completed 
        without error or no self-test has ever 
        been run. 
Total time to complete Offline 
data collection:  ( 139) seconds. 

So from the above, I need to capture only the following:

(0x00) Offline data collection activity 
        was never started. 
        Auto Offline Data Collection: Disabled. 

That is, without running on and capturing Self-test execution status: as part of it, since that is the key of the next parameter.
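To make the difficulty concrete, here is a minimal sketch (not the regex from this post) of a naive pattern that treats the text before the first colon on any line as a key. The variable names `$sample` and `@naive_keys` are made up for illustration. On the first lines of the output it also picks up the indented "Auto Offline Data Collection" line as if it were a key:

```perl
use strict;
use warnings;

# First lines of the sample output, with the indentation preserved.
my $sample = join "\n",
    'Offline data collection status: (0x00) Offline data collection activity',
    '        was never started.',
    '        Auto Offline Data Collection: Disabled.',
    'Self-test execution status:  ( 0) The previous self-test routine completed';

# Naive: whatever precedes the first colon on a line counts as a key,
# even when the line is indented (i.e. part of a value).
my @naive_keys = $sample =~ /^[ \t]*([^:\n]+):/mg;

print "$_\n" for @naive_keys;
```

The second entry printed is the spurious one; it belongs to the value of the first key.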

Any help or thoughts on this situation would be appreciated.

+0

FYI, you are using Python regex in your regex.com expression. I don't think it makes a difference in this case, but Python regex != Perl regex. – ThisSuitIsBlackNot

+0

I know it is Python regex on that site; I did not see an option for Perl. But as you said, it probably makes no big difference here, since the regular expression should be universal. If you know a site that does the same thing but for Perl regex, please let me know. –

+1

@HuyNguyen I would use PCRE for Perl – Laurel

Answer

2

I think you can take advantage of the fact that the keys start at the beginning
of the line and that every value line is preceded by at least one horizontal
whitespace character.

(?m)((?:^(?!\s)[^:\n]*\n?)+):(\h+.*?(?:\n|\z)(?:^\h+.*?(?:\n|\z))*)?

You don't need a separate multiline modifier; it is included inline as (?m).

while ($smartdata =~ /(?m)((?:^(?!\s)[^:\n]*\n?)+):(\h+.*?(?:\n|\z)(?:^\h+.*?(?:\n|\z))*)?/g) 
{ 
    push @key, $1; 
    push @value, $2; 
} 

Expanded

(?m) 
(       # (1 start), Key 
     (?: 
     ^
      (?! \s) 
      [^:\n]* 
      \n? 
    )+ 
)        # (1 end) 
: 
(       # (2 start), Value 
     \h+ .*? 
     (?: \n | \z) 
     (?: 
     ^\h+ .*? 
      (?: \n | \z) 
    )* 
)?       # (2 end) 
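The expanded layout can also be used directly in Perl with the /x modifier, which ignores whitespace and allows comments inside the pattern. The following is only a sketch under that assumption: the names `$kv_re`, `$sample` and `@pairs` are made up for illustration, and the whitespace clean-up at the end is an optional extra that is not part of the regex above. It needs Perl 5.10 or later for `\h` and `//`.

```perl
use strict;
use warnings;

# Same pattern as above, written with /x; /m replaces the inline (?m).
my $kv_re = qr{
    (                               # (1) key: unindented line(s) containing no colon
        (?: ^ (?!\s) [^:\n]* \n? )+
    )
    :
    (                               # (2) value: rest of the line, then indented lines
        \h+ .*? (?: \n | \z )
        (?: ^ \h+ .*? (?: \n | \z ) )*
    )?
}xm;

my $sample = <<'END';
Total time to complete Offline
data collection:  ( 139) seconds.
SMART capabilities:   (0x0003) Saves SMART data before entering
        power-saving mode.
END

my @pairs;
while ($sample =~ /$kv_re/g) {
    my ($key, $value) = ($1, $2 // '');
    for ($key, $value) {
        s/\s+/ /g;          # fold newlines and runs of blanks into single spaces
        s/^\s+|\s+$//g;     # trim both ends
    }
    push @pairs, [$key, $value];
}

print "$_->[0] => $_->[1]\n" for @pairs;
```

Folding the whitespace turns the two-line keys and multi-line values into single-line strings, which may be more convenient as hash keys later on.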

Perl sample

use strict; 
use warnings; 

$/ = undef; 

my $smartdata = <DATA>; 

my @key =(); 
my @val =(); 

while ($smartdata =~ /(?m)((?:^(?!\s)[^:\n]*\n?)+):(\h+.*?(?:\n|\z)(?:^\h+.*?(?:\n|\z))*)?/g) 
{ 
    push @key, $1; 
    if (defined $2) { 
     push @val, $2; 
    } 
    else { 
     push @val, ''; 
    } 
} 

# Note: $#key-1 stops one short of the last index, so the final key/value pair is not printed.
for (0 .. ($#key-1)) 
{ 
    print "key $_ = $key[$_]\n"; 
    print "value = $val[$_]\n-------------------\n"; 
} 

__DATA__ 

Offline data collection status: (0x00) Offline data collection activity 
        was never started. 
        Auto Offline Data Collection: Disabled. 
Self-test execution status:  ( 0) The previous self-test routine completed 
        without error or no self-test has ever 
        been run. 
Total time to complete Offline 
data collection:  ( 139) seconds. 
Offline data collection 
capabilities:   (0x73) SMART execute Offline immediate. 
        Auto Offline data collection on/off support. 
        Suspend Offline collection upon new 
        command. 
        No Offline surface scan supported. 
        Self-test supported. 
        Conveyance Self-test supported. 
        Selective Self-test supported. 
SMART capabilities:   (0x0003) Saves SMART data before entering 
        power-saving mode. 
        Supports SMART auto save timer. 
Error logging capability:  (0x01) Error logging supported. 
        General Purpose Logging supported. 
Short self-test routine 
recommended polling time: ( 2) minutes. 



Extended self-test routine 
recommended polling time: (100) minutes. 
Conveyance self-test routine 
recommended polling time: ( 3) minutes. 
SCT capabilities:   (0x1081) SCT Status supported. 

Output

key 0 = Offline data collection status 
value = (0x00) Offline data collection activity 
        was never started. 
        Auto Offline Data Collection: Disabled. 

------------------- 
key 1 = Self-test execution status 
value =  ( 0) The previous self-test routine completed 
        without error or no self-test has ever 
        been run. 

------------------- 
key 2 = Total time to complete Offline 
data collection 
value =   ( 139) seconds. 

------------------- 
key 3 = Offline data collection 
capabilities 
value =    (0x73) SMART execute Offline immediate. 
        Auto Offline data collection on/off support. 
        Suspend Offline collection upon new 
        command. 
        No Offline surface scan supported. 
        Self-test supported. 
        Conveyance Self-test supported. 
        Selective Self-test supported. 

------------------- 
key 4 = SMART capabilities 
value =    (0x0003) Saves SMART data before entering 
        power-saving mode. 
        Supports SMART auto save timer. 

------------------- 
key 5 = Error logging capability 
value =   (0x01) Error logging supported. 
        General Purpose Logging supported. 

------------------- 
key 6 = Short self-test routine 
recommended polling time 
value =  ( 2) minutes. 

------------------- 
key 7 = Extended self-test routine 
recommended polling time 
value =  (100) minutes. 

------------------- 
key 8 = Conveyance self-test routine 
recommended polling time 
value =  ( 3) minutes. 

------------------- 
+0

I have tested it, but not in Perl. Let me run a test case and see, brb .. – sln

+0

This works for me. There are a few keys with the words "recommended polling time". I plugged it into my script and it pulls the results nicely. –

+0

Seems to work; I have added a sample. The regex treats a colon in the body as part of the value, so that is not a problem. – sln

1

Both keys and data are split across multiple lines, so we need to handle both cases:

use strict; 
use warnings; 

my %data; 

my $lastkey; 

my $prefixkey = ""; 

while (my $smartdata = <DATA>) { 
    chomp $smartdata; 

    if ($smartdata =~ m/^\S/) { 
     if ($smartdata =~ m/^([^:]+):\s+(.*)$/) { # is a complete or end of a key and data 

      $lastkey = $prefixkey ? "$prefixkey $1" : $1; 

      $data{$lastkey} = $2; 

      $prefixkey = ""; 
     } 
     else { # this is the start of a key 
      $smartdata =~ s/(^\s+|\s+$)//g; # strip leading and trailing whitespace
      $prefixkey = $smartdata; 
     } 
    } 
    else { # this is a data continuation 
     $smartdata =~ s/(^\s+|\s+$)//g; # strip leading and trailing whitespace
     $data{$lastkey} .= " $smartdata"; 
    } 
} 

for my $key (keys(%data)) { 
    print("$key:\t$data{$key}\n"); 
} 

__DATA__ 
Offline data collection status: (0x00) Offline data collection activity 
        was never started. 
        Auto Offline Data Collection: Disabled. 
Self-test execution status:  ( 0) The previous self-test routine completed 
        without error or no self-test has ever 
        been run. 
Total time to complete Offline 
data collection:  ( 139) seconds. 
Offline data collection 
capabilities:   (0x73) SMART execute Offline immediate. 
        Auto Offline data collection on/off support. 
        Suspend Offline collection upon new 
        command. 
        No Offline surface scan supported. 
        Self-test supported. 
        Conveyance Self-test supported. 
        Selective Self-test supported. 
SMART capabilities:   (0x0003) Saves SMART data before entering 
        power-saving mode. 
        Supports SMART auto save timer. 
Error logging capability:  (0x01) Error logging supported. 
        General Purpose Logging supported. 
Short self-test routine 
recommended polling time: ( 2) minutes. 
Extended self-test routine 
recommended polling time: (100) minutes. 
Conveyance self-test routine 
recommended polling time: ( 3) minutes. 
SCT capabilities:   (0x1081) SCT Status supported. 

Produces:

Error logging capability: (0x01) Error logging supported. General Purpose Logging supported. 
Total time to complete Offline data collection: ( 139) seconds. 
SCT capabilities: (0x1081) SCT Status supported. 
Offline data collection capabilities: (0x73) SMART execute Offline immediate. Auto Offline data collection on/off support. Suspend Offline collection upon new command. No Offline surface scan supported. Self-test supported. Conveyance Self-test supported. Selective Self-test supported. 
SMART capabilities: (0x0003) Saves SMART data before entering power-saving mode. Supports SMART auto save timer. 
Conveyance self-test routine recommended polling time: ( 3) minutes. 
Self-test execution status: ( 0) The previous self-test routine completed without error or no self-test has ever been run. 
Extended self-test routine recommended polling time: (100) minutes. 
Offline data collection status: (0x00) Offline data collection activity was never started. Auto Offline Data Collection: Disabled. 
Short self-test routine recommended polling time: ( 2) minutes. 
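Note that keys(%data) returns the keys in no particular order, which is why the listing above does not follow the order of the input. If the original order matters, one option is to record each key the first time it is seen. The sketch below follows the same line-by-line approach on a shortened input; the `@order` array, the `@lines` sample and the trimmed `$text` copy are assumptions added for illustration.

```perl
use strict;
use warnings;

my (%data, @order);                  # @order remembers keys in first-seen order
my ($lastkey, $prefixkey) = (undef, '');

my @lines = (
    'Short self-test routine',
    'recommended polling time: ( 2) minutes.',
    'Error logging capability:  (0x01) Error logging supported.',
    '        General Purpose Logging supported.',
);

for my $line (@lines) {
    (my $text = $line) =~ s/^\s+|\s+$//g;        # trimmed copy of the line

    if ($line =~ /^\s/) {                        # indented: continuation of the last value
        $data{$lastkey} .= " $text" if defined $lastkey;
    }
    elsif ($text =~ /^([^:]+):\s*(.*)$/) {       # complete key, or second half of one
        $lastkey = $prefixkey ne '' ? "$prefixkey $1" : $1;
        push @order, $lastkey unless exists $data{$lastkey};
        $data{$lastkey} = $2;
        $prefixkey = '';
    }
    else {                                       # first half of a two-line key
        $prefixkey = $text;
    }
}

print "$_: $data{$_}\n" for @order;
```

Iterating over `@order` instead of `keys(%data)` prints the entries in the order they appear in the smartctl output.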
0

The format of this data is not the greatest, but at least it is predictable. We can parse it based on what the start of each line looks like.

use strict; 
use warnings; 
use Data::Dumper; 

my %data; 
my $key; 
my $record; 

while (<DATA>) { 
    chomp; 

    if (s/^\s+/ /g) { 
     $record .= $_; 
    } elsif (s/^([^:]+):\s+//) { # key (or end of a key), then one or more spaces, then the value
     if (length($record)) { 
      $data{$key} = $record; 
      $key = ''; 
     } 

     $key .= $1; 
     $record = $_; 
    } else { 
     $data{$key} = $record; 
     $key = $_ . ' '; 
     $record = ''; 
    } 
} 

$data{$key} = $record; 
print Dumper(\%data); 

__DATA__ 
Offline data collection status: (0x00) Offline data collection activity 
        was never started. 
        Auto Offline Data Collection: Disabled. 
Self-test execution status:  ( 0) The previous self-test routine completed 
        without error or no self-test has ever 
        been run. 
Total time to complete Offline 
data collection:  ( 139) seconds. 
Offline data collection 
capabilities:   (0x73) SMART execute Offline immediate. 
        Auto Offline data collection on/off support. 
        Suspend Offline collection upon new 
        command. 
        No Offline surface scan supported. 
        Self-test supported. 
        Conveyance Self-test supported. 
        Selective Self-test supported. 
SMART capabilities:   (0x0003) Saves SMART data before entering 
        power-saving mode. 
        Supports SMART auto save timer. 
Error logging capability:  (0x01) Error logging supported. 
        General Purpose Logging supported. 
Short self-test routine 
recommended polling time: ( 2) minutes. 
Extended self-test routine 
recommended polling time: (100) minutes. 
Conveyance self-test routine 
recommended polling time: ( 3) minutes. 
SCT capabilities:   (0x1081) SCT Status supported. 

Output:

$VAR1 = { 
      'Error logging capability' => '(0x01) Error logging supported. General Purpose Logging supported.', 
      'Total time to complete Offline data collection' => '( 139) seconds.', 
      'SCT capabilities' => '(0x1081) SCT Status supported.', 
      'Offline data collection capabilities' => '(0x73) SMART execute Offline immediate. Auto Offline data collection on/off support. Suspend Offline collection upon new command. No Offline surface scan supported. Self-test supported. Conveyance Self-test supported. Selective Self-test supported.', 
      'SMART capabilities' => '(0x0003) Saves SMART data before entering power-saving mode. Supports SMART auto save timer.', 
      'Conveyance self-test routine recommended polling time' => '( 3) minutes.', 
      'Self-test execution status' => '( 0) The previous self-test routine completed without error or no self-test has ever been run.', 
      'Extended self-test routine recommended polling time' => '(100) minutes.', 
      'Offline data collection status' => '(0x00) Offline data collection activity was never started. Auto Offline Data Collection: Disabled.', 
      'Short self-test routine recommended polling time' => '( 2) minutes.' 
     };