Why can performing a split by position on a column cause a dataflow to load more data than expected?

Study for the Fabric Analytics Engineer Associate Test. Use flashcards and multiple choice questions, each with hints and explanations. Get ready for your exam!

Multiple Choice

Why can performing a split by position on a column cause a dataflow to load more data than expected?

Explanation:
Split by position typically can’t be folded back to the data source. In Power Query/Dataflows, query folding translates as many steps as possible into the source’s native query so that only the needed rows are retrieved. When a step such as split by position can’t be translated, folding stops at that step: the data is pulled into Power Query first, the split is performed there, and any filters that follow the split are applied afterwards. More data is therefore loaded into the dataflow than if the filtering had happened at the source. In short, the extra load comes from pulling the data into Power Query before filtering, rather than filtering at the source.
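The difference in rows transferred can be imitated outside Power Query. The following is only an analogy in Python with an in-memory SQLite database standing in for the source; it is not Power Query M, and the `orders` table, its `code` and `year` columns, and the sample values are all hypothetical. It compares a filter that runs inside the source query with the same filter applied after the rows have been pulled and split locally.

```python
import sqlite3

# Hypothetical source: 3,000 orders with codes such as "US-0042" across three years.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (code TEXT, year INTEGER)")
rows = [(f"US-{n:04d}", year) for year in (2022, 2023, 2024) for n in range(1000)]
conn.executemany("INSERT INTO orders VALUES (?, ?)", rows)

# "Folded" case: the filter is part of the source query,
# so only matching rows leave the source.
folded = conn.execute(
    "SELECT code, year FROM orders WHERE year = 2024"
).fetchall()

# "Not folded" case: every row is pulled first, the code is split by
# position locally, and only then is the filter applied.
pulled = conn.execute("SELECT code, year FROM orders").fetchall()
split_rows = [(code[:2], code[3:], year) for code, year in pulled]
filtered_locally = [row for row in split_rows if row[2] == 2024]

print(len(folded))            # rows transferred when the filter runs at the source
print(len(pulled))            # rows transferred when the filter runs after the split
print(len(filtered_locally))  # rows remaining after local filtering
```

Both paths end with the same number of rows, but the second one transfers the whole table before discarding most of it, which is the behavior the explanation describes.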

