Efficient Large-Data Display in WPF — DataGrid Tuning and Lazy Data Sources

The built-in System.Windows.Controls.DataGrid is virtualized — only the rows currently on screen are realized into visuals. People hear “virtualized” and assume the control will scale to whatever you throw at it. It doesn’t. There’s a row count past which the grid is fine but everything around it — the ItemsSource materialization, sort, filter, style triggers, export — falls over.

This article is about that gap. We’ll walk through where WPF DataGrid actually stops scaling, what you can recover by tuning, and the lazy-data-source pattern that lets you keep a million-row workload usable.

What “large” means for a WPF DataGrid

Up to about 10,000 rows, the stock DataGrid with virtualization on is usually fine. The user scrolls, rows materialize, scrolling is smooth. You will not feel a problem.

Between 10,000 and 100,000 rows, virtualization still works while scrolling, but the things around scrolling start to hurt: initial bind, sort, filter, AutoGenerateColumns measuring, Style.Triggers on every cell.

Past 100,000 rows the model itself is the problem. Even if every visible row is virtualized, you’ve still materialized 100,000 view-models in memory, run them all through the sort comparer, and held them in a CollectionView. Memory and GC start to dominate.

This article assumes you’ve already established you need to display this many rows — telemetry browsers, transaction blotters, log viewers, lab data. If you only need to paginate a long list, that’s a different problem and DataGrid plus paging works fine.

Where DataGrid stalls in practice

Symptom	What’s actually happening
Initial load freezes the UI for several seconds	Every row in `ItemsSource` is constructed and held in memory before the first paint
Scrolling is jerky	`ScrollUnit="Item"` measures every row, or row containers re-template on each scroll tick
Sort takes 5+ seconds on a column click	`ICollectionView.SortDescriptions` walks every row in the source, not just the visible ones
Filter dropdowns lag	Each opening of the dropdown re-enumerates distinct values across all rows
`AutoGenerateColumns="True"` is slow on first paint	The control measures column widths against a sample of rows that grows with row count
Excel export OOMs	Each row gets a fresh `XLCell` / `ExcelCell` instance; 500k cells × object overhead is a lot

The fixes for the symptoms in the top half of that table are tuning. The fixes for the bottom half — especially export and any “show this big dataset with formulas / merges / multiple sheets” requirement — are architectural, and we’ll get to them.

Option 1: tune the WPF DataGrid

Before adding new dependencies, try the levers the control already gives you. The good news is most defaults are already correct in modern .NET; the failures tend to come from XAML that disabled a default by accident.

Confirm virtualization is on. The defaults are EnableRowVirtualization="True" and EnableColumnVirtualization="True". If you’ve set ScrollViewer.CanContentScroll="False" anywhere up the visual tree — or wrapped the grid in another ScrollViewer — virtualization silently turns off. Check the inspector.

Use item-level scrolling, not pixel-level, for heavy templates.

<DataGrid VirtualizingPanel.ScrollUnit="Item"
          VirtualizingPanel.VirtualizationMode="Recycling"
          EnableRowVirtualization="True"
          EnableColumnVirtualization="True" />

Recycling reuses container instances instead of creating new ones on every scroll — a big win when row templates have triggers, converters, or attached behaviors. Trade-off: smooth pixel-precise scrolling becomes step-by-row. For data grids users skim, this is the right trade.

Don’t auto-size every column. DataGridLength.Auto re-measures across all loaded rows on every change. Pin the widths you know, and use * (star) only on the column the user actually resizes.

Avoid AutoGenerateColumns="True" for production grids. Define columns explicitly. Auto-generation walks the source to infer types and runs reflection per row on first paint.

Load asynchronously. Don’t assign ItemsSource from a synchronous DB call on the UI thread:

// Inside an async event handler:
var rows = await Task.Run(() => _repo.LoadRows());
grid.ItemsSource = rows;          // assignment itself is fast; the cost was the load

For larger sets, switch to incremental loading — push rows in via ObservableCollection<T> in batches, or use a CollectionView that defers refresh:

using (collectionView.DeferRefresh())
{
    foreach (var batch in batches)
        foreach (var row in batch)
            source.Add(row);
}

These are the cheap fixes. They take a 100k-row screen from “freezes for 6 seconds” to “loads instantly and scrolls smoothly.” They do not solve the architectural problems: sort still touches every row, filter still enumerates the whole source, and the moment a user asks for formulas, merged headers, or .xlsx round-tripping, you’re outside what a DataGrid is built for. See Choosing the Right Editable Grid for WinForms and WPF Apps for that decision.

Option 2: a lazy data source

Once you’re past a few hundred thousand rows, the right move is to stop loading everything into the grid’s memory in the first place. The control should ask the source for the rows it needs to draw, and only those.

ReoGrid 4.4 added a built-in LazyLoadDataSource that makes this almost a one-liner. You hand it an object[,] array, attach it in lazy mode, and the worksheet only materializes Cell instances for the rows that are actually rendered or referenced. The array is held by reference — it isn’t copied — so the grid-side memory is roughly “the visible region” rather than “every row times every column.”

using unvell.ReoGrid.Data;

// 1,000,000 rows × 10 columns, already in memory as a plain object[,].
var data = new object[1_000_000, 10];
FillData(data);   // your own population logic

worksheet.SetRows(data.GetLength(0));

// One line. Range is inferred from the source's RecordCount / ColumnCount.
worksheet.AddDataSource(
    new LazyLoadDataSource(data),
    DataSourceLoadMode.LazyLoading);

With one million rows attached, only the ~30 rows currently on screen are ever turned into Cell instances. As the user scrolls, more rows are materialized on demand. Formulas that reference a not-yet-visible cell pull it through the same path, so cross-row references still work.

If you want to point the source at a sub-range instead of the whole sheet, the explicit form is still there:

worksheet.AddDataSource(
    new RangePosition(0, 0, rows, cols),
    new LazyLoadDataSource(data),
    DataSourceLoadMode.LazyLoading);

A few practical notes:

The object[,] is held by reference. Mutating it after attach will be reflected the next time the affected cells are loaded. Convenient for streaming new values into existing rows; surprising if you weren’t expecting it.
To append rows, swap the source. Grow the object[,], call RemoveAllDataSources(), then AddDataSource(new LazyLoadDataSource(newData), …) and SetRows(newRows). The demo in the source tree (DemoJP/Performance/LazyLoadDemo.cs) follows exactly this pattern.
Memory profile. The raw object[,] is still in memory — what you avoid is the grid-side materialization (per-row records, styles, borders, formula state). If even the raw array is too large, drop down to a custom IDataSource<T> whose GetRecord reads from disk or a database — same DataSourceLoadMode.LazyLoading, just a different backing store.

Full walkthrough with formatting and borders: Speeding Up Large Data Loading with Lazy Load Mode.

Option 3: bulk write when you already have the data

If the entire dataset is in memory and you just want to put it on the screen as fast as possible, lazy loading isn’t the right tool — bulk writing is. The cost of sheet["A1"] = x 200,000 times is mostly per-cell event/setup overhead, not the value assignment itself.

SetRangeData takes the whole 2D block at once and writes it in one pass:

var data = new object[rows, cols];
for (int r = 0; r < rows; r++)
    for (int c = 0; c < cols; c++)
        data[r, c] = source[r, c];

sheet.SetRangeData("A1", data);

In ReoGrid 4.4 this is roughly 3× faster than per-cell assignment on a 200,000-cell workload, and it composes cleanly with conditional formatting — which 4.4 also made dramatically faster on bulk writes (see the 4.4 release notes).

Rule of thumb:

Streaming or open-ended source → lazy data source.
Whole dataset in hand at once → SetRangeData.
Both true → load with SetRangeData, leave the grid in normal mode.

What you get for free that DataGrid won’t

Once you’re already in a spreadsheet control because of row count, you incidentally pick up things WPF DataGrid never had:

Frozen rows/columns that survive horizontal scrolling
Cell merging in headers and body
Multiple sheets in the same control instance
Conditional formatting that doesn’t re-run on every scroll tick (4.4 made bulk-write + conditional formatting ~11,700× faster)
Native .xlsx export of the visible data, formulas and formats intact
Formulas between cells, so users can add a =SUM(...) row without you writing code

You don’t have to use these. They’re there if the requirement shows up next quarter.

Exporting the dataset to .xlsx

The “show 500k rows” requirement is almost always followed by “and let me download it as an Excel file.” This is where DataGrid-based stacks fall hardest — the export path usually re-enumerates Items and constructs a fresh cell object per value, which is what eats the memory.

If the data is already in a Worksheet, the export is a single call:

worksheet.Workbook.Save("export.xlsx");

No second pass. No copy into a different library’s object model. The same in-memory representation that’s drawing the grid is what gets written.

For DataTable-shaped sources where the grid is incidental, the comparison of the standalone libraries is in Export a DataTable to Excel (.xlsx) in C# Without Office Interop.

A short decision tree

Fewer than 10k rows, list-shaped, no formulas? → stock WPF DataGrid, defaults are fine.
10k–100k rows, still list-shaped? → WPF DataGrid with the tuning above (recycling, explicit columns, async load).
100k+ rows, or data arrives incrementally? → lazy data source (IDataSource + DataSourceLoadMode.LazyLoading).
Whole dataset in memory, want it on screen instantly? → SetRangeData bulk write.
Any of: formulas, merged headers, multiple sheets, .xlsx round-trip? → you’ve outgrown DataGrid. See Choosing the Right Editable Grid.

The expensive mistake here is the same one as for editable grids in general: keeping the built-in control past the point where it makes sense, and patching the symptoms one at a time. The first three tuning fixes are worth doing. If you’re still hitting walls after them, the problem is the model, not the knobs.

Displaying Large Datasets Efficiently in WPF — When DataGrid Stops Scaling

What “large” means for a WPF DataGrid

Where DataGrid stalls in practice

Option 1: tune the WPF DataGrid

Option 2: a lazy data source

Option 3: bulk write when you already have the data

What you get for free that DataGrid won’t

Exporting the dataset to .xlsx

A short decision tree

Further reading

Try ReoGrid in your own project

Related articles

GrapeCity SPREAD vs ReoGrid — Choosing a .NET Spreadsheet for Light Weight, Speed, and Large Data

When Your Nested IFs Go Three Levels Deep, Switch to IFS — Sorting Out IF / AND / OR / IFERROR

Split Full Names, Parse Codes, Join Columns — Text Manipulation with LEFT, MID, FIND, SUBSTITUTE and TEXTJOIN in C#

Related articles

GrapeCity SPREAD vs ReoGrid — Choosing a .NET Spreadsheet for Light Weight, Speed, and Large Data

GrapeCity SPREAD (now MESCIUS / SPREAD.NET) is a proven, full-featured .NET spreadsheet suite. ReoGrid is a lightweight, fast alternative. This is an honest comparison — SPREAD wins on breadth of features and ecosystem maturity; ReoGrid wins on a single-DLL footprint, rendering speed, large-data handling, and a free-to-start path. Here's how to pick for a given project.
June 19, 2026

When Your Nested IFs Go Three Levels Deep, Switch to IFS — Sorting Out IF / AND / OR / IFERROR

Pass/fail checks, letter grades, achievement-rate rankings — business spreadsheets are full of conditional logic. Everyone's first tool is IF, but once the nesting gets deep the parentheses become unreadable. This guide sorts out IFS for flattening nested IFs, AND / OR for combining conditions, and IFERROR for turning errors into sensible defaults — then shows how ReoGrid (IFS / IFERROR supported since V4.5) runs the same formulas inside a WinForms / WPF app, no Office required.
July 17, 2026

Split Full Names, Parse Codes, Join Columns — Text Manipulation with LEFT, MID, FIND, SUBSTITUTE and TEXTJOIN in C#

Splitting full names into first and last, pulling the domain out of an email address, breaking a product code apart at the hyphens — string cleanup is one line of C# with Split, but if users need to see and fix the results on screen, spreadsheet formulas are the better tool. This guide covers the workhorse recipes built from LEFT, MID, FIND, SUBSTITUTE, and TEXTJOIN, the classic traps (stray spaces, the missing SEARCH function), and how ReoGrid (supported in V4.5) runs the same formulas inside a WinForms / WPF app — no Office required.
July 10, 2026