ExtensionDtype is missing hasobject #26531

Closed
mitar opened this issue May 26, 2019 · 8 comments
Labels
Closing Candidate May be closeable, needs more eyeballs Enhancement ExtensionArray Extending pandas with custom dtypes or arrays.

Comments

@mitar
Contributor

mitar commented May 26, 2019

Problem description

For better compatibility with NumPy dtypes, hasobject could be added to ExtensionDtype. We use it to determine whether columns contain Python objects, but it does not work with sparse column types, which extend ExtensionDtype.
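
A rough illustration of the gap described above, as a sketch rather than pandas code: NumPy dtypes expose hasobject, while a dtype-like object that only provides kind does not. The names FakeExtensionDtype and dtype_has_object are hypothetical and exist only for this example, and the fallback to kind == "O" is an assumption, not agreed behavior.

```python
import numpy as np


class FakeExtensionDtype:
    """Hypothetical stand-in for an extension dtype: it has `kind` but no `hasobject`."""

    kind = "f"


def dtype_has_object(dtype):
    """Return `dtype.hasobject` when present, else fall back to `kind == "O"` (an assumption)."""
    try:
        return bool(dtype.hasobject)
    except AttributeError:
        return getattr(dtype, "kind", None) == "O"


print(dtype_has_object(np.dtype("float64")))   # NumPy dtype without Python objects
print(dtype_has_object(np.dtype(object)))      # NumPy object dtype
print(dtype_has_object(FakeExtensionDtype()))  # no `hasobject`, so `kind` decides
```

Calling code that reads dtype.hasobject directly would raise AttributeError on the stand-in class, which is the kind of incompatibility the report is about.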

Output of pd.show_versions()

INSTALLED VERSIONS

commit: None
python: 3.6.7.final.0
python-bits: 64
OS: Linux
OS-release: 4.18.0-20-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8

pandas: 0.25.0.dev0+610.gd2beaf3c8
pytest: None
pip: 18.1
setuptools: 40.7.1
Cython: 0.29.7
numpy: 1.15.4
scipy: 1.2.0
pyarrow: 0.13.0
xarray: None
IPython: None
sphinx: None
patsy: None
dateutil: 2.7.5
pytz: 2018.9
blosc: None
bottleneck: None
tables: None
numexpr: None
feather: None
matplotlib: None
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml.etree: None
bs4: None
html5lib: None
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: None
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None
gcsfs: None

@TomAugspurger
Contributor

Is this different from .kind being ‘O’?

@mitar
Contributor Author

mitar commented May 26, 2019

To my understanding, the difference is that hasobject also returns True if fields in structs hold objects. So you could have a C struct type with fields pointing to objects. Its kind would not be O, but hasobject would still be True.
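
The distinction can be checked with a NumPy structured dtype. This is a minimal sketch; the field names count and payload are made up for the example.

```python
import numpy as np

# A structured dtype with one plain integer field and one Python-object field.
struct_dtype = np.dtype([("count", "i4"), ("payload", "O")])

print(struct_dtype.kind)       # "V" (structured/void), not "O"
print(struct_dtype.hasobject)  # True, because the "payload" field holds objects

# For comparison, a plain object dtype reports both.
object_dtype = np.dtype(object)
print(object_dtype.kind, object_dtype.hasobject)
```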

@jorisvandenbossche jorisvandenbossche added Dtype Conversions Unexpected or buggy dtype conversions ExtensionArray Extending pandas with custom dtypes or arrays. labels May 27, 2019
@jorisvandenbossche
Member

One question is: what should this return for extension arrays/dtypes that store data "natively" but box into Python objects (e.g. when converting to a NumPy array)? The answer might depend on the reason you are checking hasobject.

@mitar
Contributor Author

mitar commented May 27, 2019

For us, the reason we check hasobject is to know whether we have to recurse into the object: whether the value is a scalar (final) value or something we have to recurse into when searching for all scalar values.

I think that for our own use, kind == 'O' might even be enough, if we treat struct types as scalar values.

@jorisvandenbossche
Member

How does object dtype signal that the value is not scalar and needs to be further recursed?
E.g. for an object array of strings, are the values also "scalars"? Or not for your use case?

@mitar
Contributor Author

mitar commented May 27, 2019

They are. Sadly, that is a false positive. Ideally, Python strings would have their own dtype and that would make our life much easier. So we recurse and then discover it is a string.
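
A minimal sketch of the kind of recursion discussed in this exchange, assuming kind == "O" is used as the signal to descend and that struct values are treated as scalars. The function name iter_scalars is hypothetical and this is not the reporter's actual code. Strings live in object arrays, so the walk reaches them and only then finds out they are leaves.

```python
import numpy as np


def iter_scalars(value):
    """Yield leaf values, descending into object arrays, lists and tuples."""
    if isinstance(value, (str, bytes)):
        # Reached only after recursing: strings are stored as Python objects,
        # so the dtype alone cannot tell us they are already scalars.
        yield value
    elif isinstance(value, np.ndarray):
        if value.dtype.kind == "O":
            # Object dtype: each element may itself be a container.
            for item in value.ravel():
                yield from iter_scalars(item)
        else:
            # Non-object dtype: every element is a final value.
            yield from value.ravel()
    elif isinstance(value, (list, tuple)):
        for item in value:
            yield from iter_scalars(item)
    else:
        yield value


data = np.empty(3, dtype=object)
data[0] = "abc"
data[1] = [1, 2]
data[2] = np.array([3.0, 4.0])

print(list(iter_scalars(data)))
```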

@mroeschke mroeschke added Enhancement and removed Dtype Conversions Unexpected or buggy dtype conversions labels Jul 10, 2021
@jbrockmendel
Member

> Is this different from .kind being ‘O’?

FWIW PeriodDtype.kind is "O".

My gut here is that hasobject is little-used and adding it is more likely to introduce problems than fix them.
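
The PeriodDtype remark ties back to the earlier question about dtypes that store data natively but box into Python objects. A small sketch, assuming a reasonably recent pandas; the date and frequency are arbitrary:

```python
import numpy as np
import pandas as pd

periods = pd.period_range("2019-05-26", periods=3, freq="D")

print(periods.dtype.kind)         # "O", as noted above
print(periods.asi8.dtype)         # int64: the ordinals stored internally
print(np.asarray(periods).dtype)  # object: values are boxed into Period objects
```

So kind == "O" here describes how the values look after conversion to NumPy, not how they are stored.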

@jbrockmendel jbrockmendel added the Closing Candidate May be closeable, needs more eyeballs label Oct 3, 2023
@mroeschke
Member

Looks like there is not much support from the core team for adding this attribute, so closing.

5 participants