Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BUG: DatetimeIndex.is_year_start breaks for business-day frequency where n is greater than 1 #58524

Open
3 tasks done
MarcoGorelli opened this issue May 2, 2024 · 4 comments · May be fixed by #58541
Open
3 tasks done
Labels
Bug Frequency DateOffsets

Comments

@MarcoGorelli
Copy link
Member

Pandas version checks

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest version of pandas.

  • I have confirmed this bug exists on the main branch of pandas.

Reproducible Example

In [68]: dr = pd.date_range("2017-01-01", periods=2, freq="B")

In [69]: dr
Out[69]: DatetimeIndex(['2017-01-02', '2017-01-03'], dtype='datetime64[ns]', freq='B')

In [70]: dr.is_year_start
Out[70]: array([ True, False])

In [71]: dr = pd.date_range("2017-01-01", periods=2, freq="2B")

In [72]: dr
Out[72]: DatetimeIndex(['2017-01-02', '2017-01-04'], dtype='datetime64[ns]', freq='2B')

In [73]: dr.is_year_start
Out[73]: array([False, False])

Issue Description

'2017-01-02' is recognised as year-start in one case, but not in the other

This hacky line is probably to blame

is_business = freqstr[0] == "B"

If anyone submits a fix, you probably want to parse the n and freq components of the offset, rather than messing around with freqstr

Expected Behavior

both return the same result

Installed Versions

INSTALLED VERSIONS

commit : d9cdd2e
python : 3.11.9.final.0
python-bits : 64
OS : Linux
OS-release : 5.15.146.1-microsoft-standard-WSL2
Version : #1 SMP Thu Jan 11 04:09:03 UTC 2024
machine : x86_64
processor : x86_64
byteorder : little
LC_ALL : None
LANG : C.UTF-8
LOCALE : en_US.UTF-8

pandas : 2.2.2
numpy : 1.26.4
pytz : 2024.1
dateutil : 2.9.0.post0
setuptools : 65.5.0
pip : 24.0
Cython : None
pytest : 8.1.1
hypothesis : 6.100.1
sphinx : None
blosc : None
feather : None
xlsxwriter : None
lxml.etree : None
html5lib : None
pymysql : None
psycopg2 : None
jinja2 : 3.1.3
IPython : 8.23.0
pandas_datareader : None
adbc-driver-postgresql: None
adbc-driver-sqlite : None
bs4 : 4.12.3
bottleneck : None
dataframe-api-compat : None
fastparquet : None
fsspec : 2024.3.1
gcsfs : None
matplotlib : 3.8.4
numba : 0.59.1
numexpr : None
odfpy : None
openpyxl : None
pandas_gbq : None
pyarrow : 15.0.2
pyreadstat : None
python-calamine : None
pyxlsb : None
s3fs : None
scipy : 1.13.0
sqlalchemy : None
tables : None
tabulate : None
xarray : None
xlrd : None
zstandard : None
tzdata : 2024.1
qtpy : None
pyqt5 : None

@MarcoGorelli MarcoGorelli added Bug Needs Triage Issue that has not been reviewed by a pandas team member Frequency DateOffsets and removed Needs Triage Issue that has not been reviewed by a pandas team member labels May 2, 2024
VISWESWARAN1998 added a commit to VISWESWARAN1998/pandas that referenced this issue May 2, 2024
@Nrezhang
Copy link
Contributor

Nrezhang commented May 2, 2024

Hi, is is_year_start supposed to return false for both since they are not on 01/01? or is it supposed to both be True. I guess my question is clarifying what is the exact expected behavior. Is the year start when you have a freq with B the first date that is not Saturday or Sunday?

@Nrezhang Nrezhang linked a pull request May 2, 2024 that will close this issue
5 tasks
@Nrezhang
Copy link
Contributor

Nrezhang commented May 2, 2024

Changed it to check the last character of the string so it handles if there are numbers in front. Works with the example, but not sure about if there are other cases that would break this. The fix seems almost too simple. What do you mean by parse the offset rather than the freqstr?

@VISWESWARAN1998
Copy link
Contributor

Changed it to check the last character of the string so it handles if there are numbers in front. Works with the example, but not sure about if there are other cases that would break this. The fix seems almost too simple. What do you mean by parse the offset rather than the freqstr?

Sometimes freqstr may have values like "10YS-JAN". I have changed like this (VISWESWARAN1998@8df36f7)

But I believe @MarcoGorelli is expecting something else.

@Nrezhang
Copy link
Contributor

Nrezhang commented May 3, 2024

Thank you. I guess we will have to see how @natmokval tackles this issue

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Bug Frequency DateOffsets
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants