2016-08-02

Select specific elements in a list and compute conditional probabilities

I need to compute the probability of a bigram when its corresponding unigrams are present in a list. For example, in the following list S, 'pretty girl', 'pretty', and 'girl' are all present. The desired result is therefore computed from the values in the list P: (0.0017) % (0.003 * 0.002) = 5.999999999999987e-06

S = ['girl', 'pretty', 'pretty girl', 'our', 'our world', 'wide', 'word', 'yes', 'yike', 'yummy'] 

P = [0.003, 0.002, 0.0017, 0.003, 0.006, 0.004, 0.002, 0.012, 0.006, 0.003] 

I have the following code. It does not give me the results I need, so I cannot move on to calculating the probabilities. What I am trying to do with this code is select the bigrams in the list and find their corresponding unigrams. I then plan to match them to their probabilities in P.

import re

M = []
for i in range(len(S)):
    s_split = S[i].split()
    s_split_len = len(S[i].split())
    if s_split_len == 2:
        m = []
        a = re.match(s_split[0], S[i])
        b = re.match(s_split[1], S[i])
        m.append(a)
        m.append(b)
        M.append(m)
        print M

[[<_sre.SRE_Match object at 0x10447b988>, None], [<_sre.SRE_Match object at 0x10447b8b8>, None], [<_sre.SRE_Match object at 0x10447b920>, None], [<_sre.SRE_Match object at 0x10447b9f0>, None], [<_sre.SRE_Match object at 0x10447bac0>, None], [<_sre.SRE_Match object at 0x10447bb90>, None], [<_sre.SRE_Match object at 0x10447bbf8>, None], [<_sre.SRE_Match object at 0x10447bc60>, None], [<_sre.SRE_Match object at 0x10447bcc8>, None], [<_sre.SRE_Match object at 0x10447bd30>, None], [<_sre.SRE_Match object at 0x10447bd98>, None], [<_sre.SRE_Match object at 0x10447be00>, None], [<_sre.SRE_Match object at 0x10447be68>, None], [<_sre.SRE_Match object at 0x10447bed0>, None], [<_sre.SRE_Match object at 0x10447bf38>, None], [<_sre.SRE_Match object at 0x1044a8030>, None], [<_sre.SRE_Match object at 0x1044a8098>, None], [<_sre.SRE_Match object at 0x1044a8100>, None], [<_sre.SRE_Match object at 0x1044a8168>, None], [<_sre.SRE_Match object at 0x1044a81d0>, None], [<_sre.SRE_Match object at 0x1044a8238>, None], [<_sre.SRE_Match object at 0x1044a82a0>, None], [<_sre.SRE_Match object at 0x1044a8308>, None], [<_sre.SRE_Match object at 0x1044a8370>, None], [<_sre.SRE_Match object at 0x1044a83d8>, None], [<_sre.SRE_Match object at 0x1044a8440>, None], [<_sre.SRE_Match object at 0x1044a84a8>, None], [<_sre.SRE_Match object at 0x1044a8510>, None], [<_sre.SRE_Match object at 0x1044a8578>, None], [<_sre.SRE_Match object at 0x1044a85e0>, None], [<_sre.SRE_Match object at 0x1044a8648>, None], [<_sre.SRE_Match object at 0x1044a86b0>, None], [<_sre.SRE_Match object at 0x1044a8718>, None]] 
[[<_sre.SRE_Match object at 0x10447b988>, None], [<_sre.SRE_Match object at 0x10447b8b8>, None], [<_sre.SRE_Match object at 0x10447b920>, None], [<_sre.SRE_Match object at 0x10447b9f0>, None], [<_sre.SRE_Match object at 0x10447bac0>, None], [<_sre.SRE_Match object at 0x10447bb90>, None], [<_sre.SRE_Match object at 0x10447bbf8>, None], [<_sre.SRE_Match object at 0x10447bc60>, None], [<_sre.SRE_Match object at 0x10447bcc8>, None], [<_sre.SRE_Match object at 0x10447bd30>, None], [<_sre.SRE_Match object at 0x10447bd98>, None], [<_sre.SRE_Match object at 0x10447be00>, None], [<_sre.SRE_Match object at 0x10447be68>, None], [<_sre.SRE_Match object at 0x10447bed0>, None], [<_sre.SRE_Match object at 0x10447bf38>, None], [<_sre.SRE_Match object at 0x1044a8030>, None], [<_sre.SRE_Match object at 0x1044a8098>, None], [<_sre.SRE_Match object at 0x1044a8100>, None], [<_sre.SRE_Match object at 0x1044a8168>, None], [<_sre.SRE_Match object at 0x1044a81d0>, None], [<_sre.SRE_Match object at 0x1044a8238>, None], [<_sre.SRE_Match object at 0x1044a82a0>, None], [<_sre.SRE_Match object at 0x1044a8308>, None], [<_sre.SRE_Match object at 0x1044a8370>, None], [<_sre.SRE_Match object at 0x1044a83d8>, None], [<_sre.SRE_Match object at 0x1044a8440>, None], [<_sre.SRE_Match object at 0x1044a84a8>, None], [<_sre.SRE_Match object at 0x1044a8510>, None], [<_sre.SRE_Match object at 0x1044a8578>, None], [<_sre.SRE_Match object at 0x1044a85e0>, None], [<_sre.SRE_Match object at 0x1044a8648>, None], [<_sre.SRE_Match object at 0x1044a86b0>, None], [<_sre.SRE_Match object at 0x1044a8718>, None], [<_sre.SRE_Match object at 0x1044a8780>, None]] 

Thanks @Chris_Rands. The example is given in my description (the first four lines of the post). The sample data are the lists S and P. The output of the code is the lists of objects in the last part of the post. – achimneyswallow
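
The `[match, None]` pairs in the output above are consistent with how `re.match` behaves: it only matches at the start of the string, so the first word of a bigram matches and the second never does. A minimal sketch, written for Python 3, with illustrative variable names that are not part of the original post:

```python
import re

bigram = 'pretty girl'
first, second = bigram.split()

# re.match is anchored at position 0 of the string it is given.
a = re.match(first, bigram)    # 'pretty' starts the string: match object
b = re.match(second, bigram)   # 'girl' does not start the string: None
print(a is not None, b)        # True None

# re.search scans the whole string and would find the second word,
# but both calls only inspect the bigram itself. Neither looks for
# the unigram entries elsewhere in S.
c = re.search(second, bigram)
print(c.group(0))              # girl
```

Either way, regular expressions are not needed here; a membership test on the list (or a dictionary lookup) is enough to find the unigrams.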

Answer

This works:

S = ['girl', 'pretty', 'pretty girl', 'our', 'our world', 'wide', 'world', 'yes', 'yike', 'yummy'] 

P = [0.003, 0.002, 0.0017, 0.003, 0.006, 0.004, 0.002, 0.012, 0.006, 0.003] 


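for i in range(len(S)):
    s_split = S[i].split()
    if len(s_split) == 2:
        # S.index raises ValueError for a missing item (it never returns
        # None), so test membership before looking up the unigrams.
        if s_split[0] in S and s_split[1] in S:
            a = i                      # position of the bigram itself
            b = S.index(s_split[0])    # position of the first unigram
            c = S.index(s_split[1])    # position of the second unigram
            # Note: % is the modulo operator; it is kept as in the question.
            value = P[a] % (P[b] * P[c])
            print("{} {}".format(S[i], value))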
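
The same lookup can also be written by mapping each n-gram to its probability. This is only a sketch under stated assumptions, not the answer author's code: it assumes S and P are aligned, it skips bigrams whose unigrams are missing, and it assumes the intended ratio is a division, since `%` in Python is the modulo operator. The names `prob` and `bigram_ratios` are illustrative. Written for Python 3:

```python
S = ['girl', 'pretty', 'pretty girl', 'our', 'our world',
     'wide', 'world', 'yes', 'yike', 'yummy']
P = [0.003, 0.002, 0.0017, 0.003, 0.006, 0.004, 0.002, 0.012, 0.006, 0.003]

# Map each n-gram to its probability (assumes S and P have equal length).
prob = dict(zip(S, P))


def bigram_ratios(prob):
    """Return {bigram: P(bigram) / (P(w1) * P(w2))} for every bigram
    whose two unigrams are both present in the mapping."""
    ratios = {}
    for ngram, p in prob.items():
        words = ngram.split()
        if len(words) != 2:
            continue  # not a bigram
        w1, w2 = words
        if w1 in prob and w2 in prob:
            # Assumption: division is intended here, not the modulo (%)
            # used in the question.
            ratios[ngram] = p / (prob[w1] * prob[w2])
    return ratios


for bigram, ratio in bigram_ratios(prob).items():
    print(bigram, ratio)
```

If the modulo really is what is wanted, replace `/` with `%` in the marked line. The dictionary makes each lookup a constant-time operation, whereas `S.index` rescans the list on every call.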