2016-07-21 5 views
0

Ich habe eine Tabelle mit unter anderem die folgenden Spalten:HDFStore: Wählen Sie, wenn Spalte in der Matrix ist

>>> hdf.select('foo').columns 
Out[22]: 
Index(['bar', 'units'], 
     dtype='object') 

Nun wollte ich diejenigen auszuwählen, in dem bar einen von zwei Werten hat:

myBar = ['1500013010', '1500002071'] 
hdf.select('foo', 'bar in [{}]'.format(', '.join(myBar))) 

Aber ich habe diese Ausnahme, die ich impliziert habe ich konnte "bar" nicht als Variable verwenden.

alle variablen refrences muß eine Referenz auf eine Achse sein (zB ‚Index‘ oder ‚Spalten‘) oder ein DATA_COLUMN die aktuell definierten Referenzen sind: index, Spalten

Aber Ist es nicht eine Spalte?

Traceback (most recent call last): 
    File "/asdf/anaconda/envs/myenv3/lib/python3.5/site-packages/pandas/io/pytables.py", line 4593, in generate 
    return Expr(where, queryables=q, encoding=self.table.encoding) 
    File "/asdf/anaconda/envs/myenv3/lib/python3.5/site-packages/pandas/computation/pytables.py", line 516, in __init__ 
    self.terms = self.parse() 
    File "/asdf/anaconda/envs/myenv3/lib/python3.5/site-packages/pandas/computation/expr.py", line 726, in parse 
    return self._visitor.visit(self.expr) 
    File "/asdf/anaconda/envs/myenv3/lib/python3.5/site-packages/pandas/computation/expr.py", line 310, in visit 
    return visitor(node, **kwargs) 
    File "/asdf/anaconda/envs/myenv3/lib/python3.5/site-packages/pandas/computation/expr.py", line 316, in visit_Module 
    return self.visit(expr, **kwargs) 
    File "/asdf/anaconda/envs/myenv3/lib/python3.5/site-packages/pandas/computation/expr.py", line 310, in visit 
    return visitor(node, **kwargs) 
    File "/asdf/anaconda/envs/myenv3/lib/python3.5/site-packages/pandas/computation/expr.py", line 319, in visit_Expr 
    return self.visit(node.value, **kwargs) 
    File "/asdf/anaconda/envs/myenv3/lib/python3.5/site-packages/pandas/computation/expr.py", line 310, in visit 
    return visitor(node, **kwargs) 
    File "/asdf/anaconda/envs/myenv3/lib/python3.5/site-packages/pandas/computation/expr.py", line 627, in visit_Compare 
    return self.visit(binop) 
    File "/asdf/anaconda/envs/myenv3/lib/python3.5/site-packages/pandas/computation/expr.py", line 310, in visit 
    return visitor(node, **kwargs) 
    File "/asdf/anaconda/envs/myenv3/lib/python3.5/site-packages/pandas/computation/expr.py", line 400, in visit_BinOp 
    op, op_class, left, right = self._possibly_transform_eq_ne(node) 
    File "/asdf/anaconda/envs/myenv3/lib/python3.5/site-packages/pandas/computation/expr.py", line 351, in _possibly_transform_eq_ne 
    left = self.visit(node.left, side='left') 
    File "/asdf/anaconda/envs/myenv3/lib/python3.5/site-packages/pandas/computation/expr.py", line 310, in visit 
    return visitor(node, **kwargs) 
    File "/asdf/anaconda/envs/myenv3/lib/python3.5/site-packages/pandas/computation/expr.py", line 413, in visit_Name 
    return self.term_type(node.id, self.env, **kwargs) 
    File "/asdf/anaconda/envs/myenv3/lib/python3.5/site-packages/pandas/computation/pytables.py", line 38, in __init__ 
    super(Term, self).__init__(name, env, side=side, encoding=encoding) 
    File "/asdf/anaconda/envs/myenv3/lib/python3.5/site-packages/pandas/computation/ops.py", line 57, in __init__ 
    self._value = self._resolve_name() 
    File "/asdf/anaconda/envs/myenv3/lib/python3.5/site-packages/pandas/computation/pytables.py", line 44, in _resolve_name 
    raise NameError('name {0!r} is not defined'.format(self.name)) 
NameError: name 'bar' is not defined 

During handling of the above exception, another exception occurred: 

Traceback (most recent call last): 
    File "/asdf/anaconda/envs/myenv3/lib/python3.5/site-packages/IPython/core/interactiveshell.py", line 2885, in run_code 
    exec(code_obj, self.user_global_ns, self.user_ns) 
    File "<ipython-input-21-75c9827e34f0>", line 1, in <module> 
    hdf.select('foo', 'bar in [{}]'.format(', '.join(bar))) 
    File "/asdf/anaconda/envs/myenv3/lib/python3.5/site-packages/pandas/io/pytables.py", line 680, in select 
    return it.get_result() 
    File "/asdf/anaconda/envs/myenv3/lib/python3.5/site-packages/pandas/io/pytables.py", line 1364, in get_result 
    results = self.func(self.start, self.stop, where) 
    File "/asdf/anaconda/envs/myenv3/lib/python3.5/site-packages/pandas/io/pytables.py", line 673, in func 
    columns=columns, **kwargs) 
    File "/asdf/anaconda/envs/myenv3/lib/python3.5/site-packages/pandas/io/pytables.py", line 4021, in read 
    if not self.read_axes(where=where, **kwargs): 
    File "/asdf/anaconda/envs/myenv3/lib/python3.5/site-packages/pandas/io/pytables.py", line 3222, in read_axes 
    self.selection = Selection(self, where=where, **kwargs) 
    File "/asdf/anaconda/envs/myenv3/lib/python3.5/site-packages/pandas/io/pytables.py", line 4580, in __init__ 
    self.terms = self.generate(where) 
    File "/asdf/anaconda/envs/myenv3/lib/python3.5/site-packages/pandas/io/pytables.py", line 4605, in generate 
    .format(where, ','.join(q.keys())) 
ValueError: The passed where expression: bar in [1500013010, 1500002071] 
      contains an invalid variable reference 
      all of the variable refrences must be a reference to 
      an axis (e.g. 'index' or 'columns'), or a data_column 
      The currently defined references are: index,columns 

Antwort

3

Ihre Spalte (n) werden nicht indiziert, daher nicht durchsucht, so dass Sie sie nicht in der where Parameter verwenden.

Demo:

In [131]: df = pd.DataFrame(np.random.randint(0,20,size=(5, 3)), columns=list('ABC')) 

In [132]: df 
Out[132]: 
    A B C 
0 19 4 18 
1 4 14 16 
2 17 13 9 
3 19 9 13 
4 16 8 10 

In [133]: fn = 'C:/temp/test.h5' 

In [134]: store = pd.HDFStore(fn) 

In [135]: store.append('df', df) 

In [136]: store.select('df', 'B > 10') 
--------------------------------------------------------------------------- 
... 
NameError: name 'B' is not defined 

During handling of the above exception, another exception occurred: 
... 
ValueError: The passed where expression: B > 10 
      contains an invalid variable reference 
      all of the variable refrences must be a reference to 
      an axis (e.g. 'index' or 'columns'), or a data_column 
      The currently defined references are: index,columns 

jetzt versuchen wir es mit einem indizierten Spalten:

In [137]: store.append('df_indexed', df, data_columns=True) 

In [139]: store.select('df_indexed', 'B > 10') 
Out[139]: 
    A B C 
1 4 14 16 
2 17 13 9 

Wie kann man überprüfen, ob Spalten indiziert sind oder nicht:

In [154]: store.get_storer('df_indexed').table.colindexes 
Out[154]: 
{ 
    "C": Index(6, medium, shuffle, zlib(1)).is_csi=False, 
    "index": Index(6, medium, shuffle, zlib(1)).is_csi=False, 
    "B": Index(6, medium, shuffle, zlib(1)).is_csi=False, 
    "A": Index(6, medium, shuffle, zlib(1)).is_csi=False} 

In [155]: store.get_storer('df').table.colindexes 
Out[155]: 
{ 
    "index": Index(6, medium, shuffle, zlib(1)).is_csi=False} 
Verwandte Themen