Pandas integer handling
Python 3.x doesn't limit the size of integers. You can use integers as big as you like, as long as they fit into your machine's memory. But pandas uses the int64 dtype by default, which does have limits.
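As a quick check of those limits, the sketch below compares an arbitrarily large Python int with the bounds NumPy reports for the 64-bit integer types. It assumes NumPy is importable (pandas depends on it), and the variable names are only illustrative.

```python
import numpy as np

# Python ints are arbitrary precision, so this value is stored exactly.
big = 2**100
print(big.bit_length())

# Fixed-width 64-bit types have hard bounds, reported by np.iinfo.
int64_info = np.iinfo(np.int64)
uint64_info = np.iinfo(np.uint64)
print(int64_info.min, int64_info.max)
print(uint64_info.min, uint64_info.max)

# The bounds match the powers of two used in the experiments.
print(int64_info.max == 2**63 - 1)
print(uint64_info.max == 2**64 - 1)
```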
I ran some experiments (in pandas 0.20.3) and here's what I found:
- the default pandas integer dtype is int64
- the maximum int64 value is 2**63-1
- creating a series containing 2**63 results in dtype uint64
- creating a series containing 2**63 and a negative value results in dtype object
- creating a series containing 2**64 results in OverflowError
- creating a series with a negative value first and 2**64 after it results in dtype object
- creating a series with 2**64 first and a negative value after it results in OverflowError
In[2]: import pandas as pd
In[3]: pd.Series([0, 1])
Out[3]:
0    0
1    1
dtype: int64
In[4]: pd.Series([0, 2**63-1])
Out[4]:
0                      0
1    9223372036854775807
dtype: int64
In[5]: pd.Series([0, 2**63-1, 2**63])
Out[5]:
0                      0
1    9223372036854775807
2    9223372036854775808
dtype: uint64
In[6]: pd.Series([0, 2**63-1, 2**63, -1])
Out[6]:
0                      0
1    9223372036854775807
2    9223372036854775808
3                     -1
dtype: object
In[7]: pd.Series([0, 2**63-1, 2**63, 2**64])
...
OverflowError: Python int too large to convert to C unsigned long
In[8]: pd.Series([0, 2**63-1, 2**63, 2**64, -1])
...
OverflowError: Python int too large to convert to C unsigned long
In[9]: pd.Series([0, 2**63-1, 2**63, -1, 2**64])
Out[9]:
0                       0
1     9223372036854775807
2     9223372036854775808
3                      -1
4    18446744073709551616
dtype: object
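If you need to keep integers outside the 64-bit range, one option is to request dtype=object explicitly, so pandas stores the plain Python ints rather than inferring a fixed-width type from the order of the values. This was not part of the experiments above, so treat it as a minimal sketch. The names `values`, `s` and `total` are only illustrative.

```python
import pandas as pd

# The same values that raised OverflowError when the dtype was inferred.
values = [0, 2**63 - 1, 2**63, 2**64, -1]

# Asking for object dtype up front keeps each element as a Python int.
s = pd.Series(values, dtype=object)
print(s.dtype)

# The built-in sum adds the Python ints with arbitrary precision.
total = sum(s)
print(total)
```

The trade-off is that an object series holds references to Python objects, so it is slower and uses more memory than a native int64 or uint64 series.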