Pagination in Power Query (follow up)

This post is an unintended follow up to my “mini-series” working through how I am connecting to Harvest and ClickUp APIs in Power Query. After writing up post #4 (iterating over an unknown number of pages), I decided to review the method I used there vs. the method in post #4 (List.Generate).

So, today’s post is really just describing another example of how to use List.Generate, this time the criteria is comparing where we are relative to the total number of pages returned. It was only after writing up the last post that I realized there is no reason not to approach both queries with the same way so I’m consistent in the underlying code. I find this approach to be much cleaner and easier to read than what I used in post #2, though both return the identical results!

The previous posts in this series were this (although today’s post was not intended to be part of the series!):

  1. Connecting to Harvest in Power Query
  2. Pagination in Power Query (Part 1 – dealing with a known number of pages)
  3. Connecting to ClickUp in Power Query
  4. Pagination in Power Query (Part 2 – iterating over an unknown number of pages)

The original method I used in post #2

Here is what I did originally. I did not use a separate function for this, all the code was in a single query. In a nutshell, I was calling the query initially, which returns page 1 and gives me the number of pages; then I use the number of pages in an array and loop through each page, building a table of results per page.

It works well, there is no “good” reason to change it other than I wanted to improve my own knowledge, and test my understanding of how List.Generate works.

let
    StartDate=Date.ToText(#date(Number.FromText(ReportYear),1,1)),
    EndDate=Date.ToText(#date(Number.FromText(ReportYear),12,31)),
    Source = Json.Document(Web.Contents(HarvestBaseURL,
        [RelativePath="time_entries",
        Query=[access_token=HarvestAccessToken,
               account_id=HarvestAccountID,
               from=StartDate,
               to=EndDate]])),
    Pages = {1..Source[total_pages]},
    PageList = Table.FromList(Pages, Splitter.SplitByNothing(), null, null, ExtraValues.Error),
    GetTimePages = Table.AddColumn(PageList, "Custom", each Json.Document(Web.Contents(HarvestBaseURL,
        [RelativePath="time_entries",
        Query=[access_token=HarvestAccessToken,
               account_id=HarvestAccountID,
               from=StartDate,
               to=EndDate,
               page=Text.From([Column1]) & ""]])))
in
    GetTimePages

The result of the initial call is this. The line “total_pages” is what I’m using in the List.Generate criteria further below.

Results of the query showing total_pages = 9 and total_entries = 874 (among other pieces of information)

The previous List.Generate example from post #4

Here is the List.Generate example from post #4 above, where I’m calling a function, and looping through as long as the “count” of the records in the returned list is greater than zero. I don’t know how many pages and it returns a list each time, even if there are no results. The bold text is what I’m changing, the criteria portion of List.Generate.

let 
    Records = List.Generate(()=> 
            [Source = GetTaskPages(0), Page=0],
            each List.Count([Source][tasks]) > 0,
            each [Source = GetTaskPages([Page]+1), Page=[Page] +1],
            each [Source])
in
    Records

Using List.Generate for the “known number of pages” example

The first thing I needed to do (or wanted to do for consistency) is create a function for the calling of the Harvest API time_entries endpoint. That was *in* the query previously (in the example at the start of this post). The raw code is the same, I’m just wrapping it in a function and ensuring the “page=” parameter is included.

(PageNo as number) =>
    let
        StartDate=Date.ToText(#date(Number.FromText(ReportYear),1,1)),
        EndDate=Date.ToText(#date(Number.FromText(ReportYear),12,31)),
        Source = Json.Document(Web.Contents(HarvestBaseURL,
            [RelativePath="time_entries",
            Query=[access_token=HarvestAccessToken,
                account_id=HarvestAccountID,
                from=StartDate,
                to=EndDate,
                page=Number.ToText(PageNo)]]))
    in 
        Source

Next, I copied the List.Generate code and updated it to reflect the new scenario (calling the different function name and changing the criteria.

let
    Records = List.Generate(
        ()=> [Source = GetHarvestTime(1), Page=1],
            each [Page] <= [Source][total_pages],
            each [Source = GetHarvestTime([Page]+1), Page=[Page] +1],
            each [Source])
in
    Records

There were 2 primary changes I needed to make:

  1. Change the function name I was calling.
    • One thing to note, Harvest’s “first page” is page “1”, where ClickUp’s first page is page “0”. So the initial value call here is calling page 1.
  2. Change the criteria to compare the current value in the “Page” parameter to the “total_pages” field from the results of the query to the API. Every result has this field so it’s very easy to identify when you are on the last page.
    • In this case, as long as the current “Page” is less than or equal to the total number of pages, it should keep iterating.

Summary

That’s it! That worked like a charm. It’s cleaner to read (to me at least) so I will keep this version.

A brief side note: I tried to compare the performance of both but the volume of records in my case are so minimal (874 records), the difference is hard to identify. It *seems* like List.Generate is more efficient when using the Diagnostics options in Power Query, but I’m not entirely sure if that is the best indicator.

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.

Scroll to top