pydata__xarray-4075

Leaderboard / xarray-4075

pydata/xarray · v0.12 · 7 steps · 39,738 tokens · 322s

resolved

1.00

The task

The bug report the agent was given — plus the gold fix and hidden tests, shown here for reference. The agent saw only the problem statement.

Problem statementgithub issue

[bug] when passing boolean weights to weighted mean
<!-- A short summary of the issue, if appropriate -->


#### MCVE Code Sample
<!-- In order for the maintainers to efficiently understand and prioritize issues, we ask you post a "Minimal, Complete and Verifiable Example" (MCVE): http://matthewrocklin.com/blog/work/2018/02/28/minimal-bug-reports -->

```python
import numpy as np
import xarray as xr

dta = xr.DataArray([1., 1., 1.])
wgt = xr.DataArray(np.array([1, 1, 0], dtype=np.bool))

dta.weighted(wgt).mean()
```
Returns 

```
<xarray.DataArray ()>
array(2.)
```

#### Expected Output
```
<xarray.DataArray ()>
array(1.)
```

#### Problem Description
Passing a b …

Gold patchheld back · the reference fix

diff --git a/xarray/core/weighted.py b/xarray/core/weighted.py--- a/xarray/core/weighted.py+++ b/xarray/core/weighted.py@@ -142,7 +142,14 @@ def _sum_of_weights( # we need to mask data values that are nan; else the weights are wrong mask = da.notnull() - sum_of_weights = self._reduce(mask, self.weights, dim=dim, skipna=False)+ # bool -> int, because ``xr.dot([True, True], [True, True])`` -> True+ # (and not 2); GH4074+ if self.weights.dtype == bool:+ sum_of_weights = self._reduce(+ mask, self.weights.astype(int), dim=dim, skipna=False+ )+ else:+ sum_of_weights = self._reduce(mask, self.weights, dim=dim, skipna=False) # 0-weights are not valid valid_weights = sum_of_weights != 0.0

Hidden testtest_patch · applied only to grade

diff --git a/xarray/tests/test_weighted.py b/xarray/tests/test_weighted.py--- a/xarray/tests/test_weighted.py+++ b/xarray/tests/test_weighted.py@@ -59,6 +59,18 @@ def test_weighted_sum_of_weights_nan(weights, expected): assert_equal(expected, result) +def test_weighted_sum_of_weights_bool():+ # https://github.com/pydata/xarray/issues/4074++ da = DataArray([1, 2])+ weights = DataArray([True, True])+ result = da.weighted(weights).sum_of_weights()++ expected = DataArray(2)++ assert_equal(expected, result)++ @pytest.mark.parametrize("da", ([1.0, 2], [1, np.nan], [np.nan, np.nan])) @pytest.mark.parametrize("factor", [0, 1, 3.14]) @pytest.mark.parametrize("skipna", (True, False))@@ -158,6 +170,17 @@ def test_weighted_mean_nan(weights, expected, skipna): assert_equal(expected, result) +def test_weighted_mean_bool():+ # https://github.com/pydata/xarray/issues/4074+ da = DataArray([1, 1])+ weights = DataArray([True, True])+ expected = DataArray(1)++ result = da.weighted(weights).mean()++ assert_equal(expected, result)++ def expected_weighted(da, weights, dim, skipna, operation): """ Generate expected result using ``*`` and ``sum``. This is checked against

2FAIL → PASS

958PASS → PASS

960 / 960green

The agent run

The agent's step-by-step attempt — its reasoning, the tools it called, and the patch it produced.

System promptharness.agent.build_system_prompt

You are an AI coding assistant working inside a code repository that contains a bug. Investigate the code, find the bug, and fix it.

Your working directory is: /testbed
All four tools (read_file, write_file, edit_file, run_bash) already run from that directory. Use plain relative paths; do not prefix shell commands with cd. Fix the bug by editing source files only; do not create or edit test files.

Step 01 tool use

Let me find the relevant code for the weighted mean computation. The bug is in the `sum_of_weights` calculation where `xr.dot(dta.notnull(), wgt)` is used. When both are boolean, the dot product returns a boolean instead of an integer. Let me find the relevant file.

run_bash{ command: "grep -r \"sum_of_weights\" --include=\"*.py\" /testbed" }

tool result · run_bash

exitcode: 0 stdout: /testbed/xarray/core/weighted.py: def _sum_of_weights( /testbed/xarray/core/weighted.py: sum_of_weights = self._reduce(mask, self.weights, dim=dim, skipna=False) /testbed/xarray/core/weighted.py: valid_weights = sum_of_weights != 0.0 /testbed/xarray/core/weighted.py: return sum_of_weights.where(valid_weights) /testbed/xarray/core/weighted.py: sum_of_weights = self._sum_of_weights(da, dim=dim) /testbed/xarray/core/weighted.py: …

Step 02 tool use

Let me look at the weighted.py file to understand the code:

read_file{ path: "/testbed/xarray/core/weighted.py" }

tool result · read_file

from typing import TYPE_CHECKING, Hashable, Iterable, Optional, Union, overload from .computation import dot from .options import _get_keep_attrs if TYPE_CHECKING: from .dataarray import DataArray, Dataset _WEIGHTED_REDUCE_DOCSTRING_TEMPLATE = """ Reduce this {cls}'s data by a weighted ``{fcn}`` along some dimension(s). Parameters ---------- dim : str or sequence of str, optional Dimension(s) over which to apply the weighted ``{fcn}``. skip …

Step 03 tool use

Now I can see the bug. In `_sum_of_weights`: ```python def _sum_of_weights( self, da: "DataArray", dim: Optional[Union[Hashable, Iterable[Hashable]]] = None ) -> "DataArray": """ Calculate the sum of weights, accounting for missing values """ # we need to mask data values that are nan; else the weights are wrong mask = da.notnull() sum_of_weights = self._reduce(mask, self.weights, dim=dim, skipna=False) # 0-weights are not valid valid_weights = sum_of_weights != 0.0 return sum_of_weights.where(valid_weights) ``` The `mask` is a boolean array (from `da.notnull()`), and `self.weights` is also boolean in the bug case. The `_reduce` method calls `dot(da, weights, dims=dim)`, which computes the dot product of two boolean arrays. The dot product of two boolean arrays returns boolean (since `True * True = True`, `True * False = False`, and sum of booleans stays boolean). So `sum_of_weights` becomes a boolean instead of an integer/float. The fix should ensure that `sum_of_weights` is computed correctly. There are a few places we could fix this: 1. In `_sum_of_weights`, we could cast `mask` to int/float before passing to `_reduce`. 2. In `_reduce`, we could cast both arrays to float before dot. 3. In the `dot` function itself. The most robust fix would be in `_sum_of_weights` - we should ensure the mask (or the result) is numeric. Let me look at what makes sense. The simplest fix: in `_sum_of_weights`, cast the mask to int or float, or cast the result. Let's cast the mask to the same dtype as weights, or just cast the mask to int/float. Actually, looking more carefully: ```python sum_of_weights = self._reduce(mask, self.weights, dim=dim, skipna=False) ``` The issue is that `mask` is bool, `weights` is bool, and dot of bools returns bool. So we should convert the mask to a numeric type. Let's use `mask = da.notnull().astype(int)` or similar. But wait, what if weights are int? Then `weights` are int, `mask` is bool, and dot should work. But what if `weights` are float? Then dot of float and bool should work fine (returns float). The problem only occurs when both are bool. The best fix is probably to convert the mask to a type that works with any weight type. We could do: ```python mask = da.notnull() # ensure the mask is not boolean to avoid incorrect dot product if mask.dtype.kind == 'b': mask = mask.astype(int) ``` But actually, a simpler and more robust approach: just cast the mask to float or int. Or we could cast the result of the dot product. But let me think... Actually, the cleanest fix is to cast `sum_of_weights` to a numeric type right after computing it. But we need to be careful - if weights are float, we should preserve that. The simplest fix: `mask = da.notnull().astype(self.weights.dtype)` - but that might not work if weights are bool. Better: just ensure mask is not boolean. Let me look at what the typical pattern is. I think the cleanest fix is: ```python # we need to mask data values that are nan; else the weights are wrong mask = da.notnull() ``` Change to: ```python # we need to mask data values that are nan; else the weights are wrong mask = da.notnull() # ensure mask is not boolean to avoid incorrect dot product if mask.dtype.kind == 'b': mask = mask.astype(int) ``` Wait, but there's another issue. Even if masks aren't boolean, if weights are boolean, the same problem could happen. Let me check: if mask is bool, weights are bool, dot returns bool. If mask is int, weights are bool, dot returns int? Let me think about numpy behavior: ```python np.dot(np.array([1, 1, 0]), np.array([True, True, False])) # This should return 2 (int) ``` Actually let me verify: `np.dot` of int and bool should promote properly. The issue is specifically when BOTH are bool. So converting just the mask should suffice. But to be extra safe, we could convert the result. Actually, let me just fix it in `_sum_of_weights` by converting the mask. Let me also consider: what if `sel ... [truncated 1,330 characters]

edit_file{ path: "/testbed/xarray/core/weighted.py", old_string: " # we need to mask data values that are nan; else the weights are wrong\n mask = da.notnull()\n\n sum_of_weights = self._reduce(mask, self.wei…", new_string: " # we need to mask data values that are nan; else the weights are wrong\n mask = da.notnull()\n\n # ensure mask is not boolean to avoid incorr…" }

tool result · edit_file

File /testbed/xarray/core/weighted.py edited successfully.

Step 04 tool use

Now let me verify the fix by running the MCVE code from the bug report:

run_bash{ command: "cd /testbed && python -c \"\nimport numpy as np\nimport xarray as xr\n\ndta = xr.DataArray([1., 1., 1.])\nwgt = xr.DataArray(np.array([1, 1, 0], dtype=np.bool))\n\nresu…" }

tool result · run_bash

exitcode: 0 stdout: Result: 1.0 Expected: 1.0 Success! stderr: <string>:6: DeprecationWarning: `np.bool` is a deprecated alias for the builtin `bool`. To silence this warning, use `bool` by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.bool_` here. Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations

Step 05 tool use

The fix works. Let me also run the existing tests to make sure nothing is broken:

run_bash{ command: "cd /testbed && python -m pytest xarray/tests/test_weighted.py -v 2>&1 | tail -40" }

tool result · run_bash

exitcode: 0 stdout: xarray/core/dask_array_compat.py:16 xarray/core/dask_array_compat.py:16 /testbed/xarray/core/dask_array_compat.py:16: DeprecationWarning: distutils Version classes are deprecated. Use packaging.version instead. if LooseVersion(dask_version) >= LooseVersion("2.0.0"): xarray/core/dask_array_compat.py:149 xarray/core/dask_array_compat.py:149 /testbed/xarray/core/dask_array_compat.py:149: DeprecationWarning: distutils Version classes are deprecated. …

Step 06 tool use

All 958 tests pass, including 2 expected failures. The fix is correct. Let me also verify with a few more edge cases:

run_bash{ command: "cd /testbed && python -c \"\nimport numpy as np\nimport xarray as xr\n\n# Test 1: boolean weights with float data\ndta = xr.DataArray([1., 1., 1.])\nwgt = xr.DataArray…" }

tool result · run_bash

exitcode: 0 stdout: Test 1 (bool weights, float data): 1.0 (expected 1.0) Test 2 (bool weights, NaN data): 1.0 (expected 1.0) Test 3 (float weights): 1.0 (expected 1.0) Test 4 (int weights): 1.0 (expected 1.0) Test 5 (sum_of_weights, bool weights): 2.0 (expected 2) Test 6 (weighted sum, bool weights): 2.0 (expected 2.0) stderr:

Step 07 reasoning

All edge cases work correctly. The fix is simple and targeted: when the mask (from `da.notnull()`) has boolean dtype, we convert it to integer before passing it to `_reduce` (which calls `dot`). This ensures the dot product of mask and weights produces a proper numeric sum even when weights are also boolean. Note: I should also check if there's a similar issue in `_reduce` for the weighted sum itself. Looking at `_weighted_sum`: ```python def _weighted_sum(self, da, dim=None, skipna=None): return self._reduce(da, self.weights, dim=dim, skipna=skipna) ``` And `_reduce`: ```python @staticmethod def _reduce(da, weights, dim=None, skipna=None): if dim is None: dim = ... if skipna or (skipna is None and da.dtype.kind in "cfO"): da = da.fillna(0.0) return dot(da, weights, dims=dim) ``` If `da` is boolean and `weights` is boolean, `dot` would also return boolean. But in the MCVE, `da` is float, so this isn't an issue. But could someone have boolean data? Let me think... If someone has boolean data with boolean weights, `weighted_sum` would be wrong too. But that's a more exotic case and the bug report doesn't mention it. The fix for the mask addresses the reported issue. Actually, looking at the `_reduce` code more carefully: when `skipna` is True or when `da.dtype.kind in "cfO"` and `skipna` is None, `da.fillna(0.0)` is called, which would convert the data to float. So for float data types, the weighted sum is always computed correctly because `fillna(0.0)` converts to float. But for integer data with `skipna=False`, the dot product could still be problematic if weights are bool. However, the reported bug is specifically about `_sum_of_weights`, which passes `skipna=False` and uses `mask` (a boolean array). My fix addresses this correctly.

The fix is straightforward. The bug is in `_sum_of_weights` in `/testbed/xarray/core/weighted.py`. The mask from `da.notnull()` is boolean, and when `weights` is also boolean, `dot()` returns a boolean result instead of a numeric sum (e.g., `True` instead of `2`). This causes the weighted mean to be computed incorrectly. The fix converts the boolean mask to an integer before passing it to `_reduce`, ensuring the dot product produces proper numeric results regardless of the weights' dtype.

Agent diffcaptured before grading

diff --git a/xarray/core/weighted.py b/xarray/core/weighted.pyindex 996d2e4c..20c58331 100644--- a/xarray/core/weighted.py+++ b/xarray/core/weighted.py@@ -142,6 +142,11 @@ class Weighted: # we need to mask data values that are nan; else the weights are wrong mask = da.notnull() + # ensure mask is not boolean to avoid incorrect dot product+ # when weights are also boolean+ if mask.dtype.kind == "b":+ mask = mask.astype(int)+ sum_of_weights = self._reduce(mask, self.weights, dim=dim, skipna=False) # 0-weights are not valid