Fun PresentationFramework side-effect of bad equality

I looked into a pretty fun bug a few days ago. The symptom: a WPF DataGrid populated with an ICollectionView was seeing inconsistent row copy behaviour. You would select a row by clicking on it and press Ctrl+C, but instead of copying the row you selected it would copy the row your cursor was hovering over.

When I hooked up to various clipboard copy events the data looked about the item being copied looked correct - we were copying the correct (selected by clicking) item. But the DataGridClipboardCellContent we were getting was wrong - the Item looked correct, but the Content was populated with the wrong row. At first I thought it was some kind of binding issue, but that turned out to not be the case.

After looking through various DataGrid classes in the MS Refrence Source, I knew things were going awry when getting the cellValue in OnCopyingRowClipboardContent.

At this point I figured it would be easier to just step through the thing in the debugger. Even though the PresentationFramework symbols were being loaded I still couldn’t view the source, so I figured I’d make my own pdb with the help of dotPeek, which did the job. I ended up stepping through DataGridOwner.GetCellClipboardValue, confirming that the item it is trying to get the clipboard value for was indeed the one we expect/want. Eventually we got to TryFindCell in the DataGrid, which attempts to get the row for the item and then tries to get the cell for the relevant column index that we’re looking for:

        internal DataGridCell TryFindCell(object item, DataGridColumn column)
            // Does not de-virtualize cells
            DataGridRow row = (DataGridRow)ItemContainerGenerator.ContainerFromItem(item);
            int columnIndex = _columns.IndexOf(column);
            if ((row != null) && (columnIndex >= 0))
                return row.TryGetCell(columnIndex);

            return null;

Things went wrong at the row finding part. We were getting the wrong row, one that didn’t actually match the item we’re trying to copy. So stepping into ItemContainerGenerator’s ContainerFromItem method I found the culprit:

        /// <summary>
        /// Return the UI element corresponding to the given item.
        /// Returns null if the item does not belong to the item collection,
        /// or if no UI has been generated for it.
        /// </summary>
        public DependencyObject ContainerFromItem(object item)
            object dummy;
            DependencyObject container;
            int index;

                delegate(object o, DependencyObject d) { return Object.Equals(o, item); },
                out dummy, out container, out index, false);

            return container;

We are calling DoLinearSearch in the System.Windows.Controls namespace (I also generated a pdb for System.Windows.dll) with a predicate that checks for equality between the item we are copying and an item container. This method does some of its own stuff to try to make this search less expensive as per the comment:

        ///     There's no avoiding a linear search, which leads to O(n^2) performance
        ///     if someone calls ContainerFromItem or IndexFromContainer for every item.
        ///     To mitigate this, we start each search at _startIndexForUIFromItem, and
        ///     heuristically set this in various places to where we expect the next
        ///     call to occur.

        ///     For example, after a successul search, we set it to the resulting
        ///     index, hoping that the next call will query either the same item or
        ///     the one after it.  And after inserting a new item, we expect a query
        ///     about the new item.  Etc.

DoLinearSearch is called not just when you copy, but also when you perform other actions, like focusing on the UI element. And guess what? Our equality overrides for the class we were copying the row of were broken. They depended on a certain ID property being set and unique, a property that wasn’t even being set and defaulted to 0 for every item. As a result whatever item container DoLinearSearch compared to our desired item up above, the result would always be true and it would pick that first element. I’m surprised the completely messed up equality didn’t bite us in more obvious parts of the code and that it took the depths of PresentationFramework to actually fall over, but the fix in the end was to fix our IDs and equality comparison.

comments powered by Disqus