AWS SDK pagination – disregard for Developer Experience

According to “Pagination using Async Iterators in modular AWS SDK for JavaScript”, the state of the art for pagination in the AWS JS SDK is:

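import { DynamoDBClient, paginateListTables } from "@aws-sdk/client-dynamodb";

const client = new DynamoDBClient({});
const tableNames = [];
for await (const page of paginateListTables({ client }, {})) {
    // page contains a single paginated output; TableNames may be absent.
    tableNames.push(...(page.TableNames ?? []));
}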

While having paginators is an improvement over previous atrocities like making the client code track the pagination state (through variously named next page fields of course), it’s still wrong.

Wouldn’t it make way more sense to have an SDK where the developer doesn’t need to deal with pagination at all? Like this:

const client = new DynamoDBClient({});
const tableNames = [];
// listTablesV2() is hypothetical; it does not exist in the AWS SDK.
for await (const tableName of client.listTablesV2()) {
    tableNames.push(tableName);
}

… where pagination is hidden behind the hypothetical listTablesV2(). Note that collecting into tableNames probably wouldn’t be necessary anymore; you would just do what you need with each item from client.listTablesV2().
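A minimal sketch of how such a method could be layered on top of any page-oriented async iterable. The names flattenPages and fakeListTablesPages are invented for illustration and are not part of the AWS SDK; the fake paginator stands in for paginateListTables so the example runs without AWS credentials.

```javascript
// Hypothetical helper (not part of the AWS SDK): turns an async iterable of
// pages into an async iterable of individual items.
async function* flattenPages(pages, pickItems) {
  for await (const page of pages) {
    // A page may carry no items at all, hence the fallback to an empty array.
    yield* pickItems(page) ?? [];
  }
}

// Stand-in for paginateListTables({ client }, {}): yields two fake pages.
async function* fakeListTablesPages() {
  yield { TableNames: ["users", "orders"] };
  yield { TableNames: ["invoices"] };
}

// The consumer sees table names only; page boundaries are invisible.
async function collectTableNames() {
  const tableNames = [];
  const items = flattenPages(fakeListTablesPages(), (page) => page.TableNames);
  for await (const tableName of items) {
    tableNames.push(tableName);
  }
  return tableNames;
}
```

With the real SDK, the first argument would presumably be the result of paginateListTables({ client }, {}) instead of the fake generator.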

I’m thrilled to hear any justification for bothering developers with an implementation detail such as pagination. Let me know.

If your justification is “… but that’s how API works” then you made a mental shortcut here. Why should the SDK be one-to-one with the API? Even AWS themselves think it shouldn’t be; they added the paginators abstraction, after all. It’s just the wrong one.

If your justification is “there will be cases where you will want explicit control over pagination” – while it’s hard for me to imagine such a use case, the solution would be to provide the low-level API in addition.

If your justification involves how we got here and/or why it was easier for AWS to implement pagination this way – that’s not the justification I’m looking for. The answer to “daddy, what’s your job?” shouldn’t be “mostly working around AWS”.

Edit 2023-10-28: Clarification. In the proposed SDK, the pages should be fetched on demand, so the cost of API calls is not an argument for explicit pagination.
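To illustrate what “on demand” means here, below is a small sketch with an invented in-memory API (makeFakeApi, fetchPage and listItems are all hypothetical names). It counts how many page requests are made when the consumer stops early.

```javascript
// Hypothetical stand-in for a paginated API: each fetchPage call counts as
// one "API call". The data and names are invented for this example.
function makeFakeApi() {
  const pages = [["a", "b"], ["c", "d"], ["e"]];
  let calls = 0;
  return {
    fetchPage: async (pageIndex) => {
      calls += 1;
      return {
        items: pages[pageIndex],
        nextPage: pageIndex + 1 < pages.length ? pageIndex + 1 : undefined,
      };
    },
    callCount: () => calls,
  };
}

// Yields items one by one; the next page is requested only when the consumer
// asks for an item beyond the current page.
async function* listItems(api) {
  let pageIndex = 0;
  while (pageIndex !== undefined) {
    const page = await api.fetchPage(pageIndex);
    yield* page.items;
    pageIndex = page.nextPage;
  }
}

// Takes the first three items and stops; the third page is never requested.
async function firstThree(api) {
  const result = [];
  for await (const item of listItems(api)) {
    result.push(item);
    if (result.length === 3) break; // ends the generator early
  }
  return result;
}
```

In this sketch, taking three items touches only the first two pages, so the consumer pays for the pages it actually reads without ever naming them.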

Nitpick: should probably be called tablesNames, not tableNames. These are names for different tables, not multiple names for the same table.

2023-10-30 Update (WIP, I’ll be editing)

The input for the update is the Reddit discussion.

The Reddit Discussion

The responses ranged from “I like the pagination” to “I think it’s a fair point – perhaps in the evolution of the sdks, nobody raised your exact question (which I think is valid) …”

Chi Ma (u/chigia001) had an insightful perspective. I assume he spent quite some time thinking about pagination as he is a maintainer of an SDK that also has pagination.

Clarification

The original post doesn’t mention it, but my thought about the current low-level API was: “it might need to be exposed too for some reason which is not currently apparent to me”.

What I Missed

I didn’t pay much attention to the fact that we are talking about AsyncIter, which is not a particularly convenient feature of JavaScript. ( to be continued )

My Opinion – Unchanged Part

The basic premise of this blog post stays the same. My code should not have to deal with pagination when I don’t want it to. It’s just unergonomic boilerplate. In the same way that I usually don’t care about lower-level things such as TCP, TLS, and serialization, I don’t want to care about how the items I need are batched.

I propose to

  • Focus on finding better solutions rather than on justifying what we have.
  • Use dissatisfaction as a driver for progress.

Update 2023-11-25: boto3 (AWS SDK for Python) does have a way to iterate over a collection of items where pagination is handled behind the scenes. From my perspective, it proves the basic premise.

My Opinion – Changed Parts

My opinion changed from being 99% sure that I have an obvious and straightforward solution (listTablesV2 above) to:

  • It’s a meh solution
    • AsyncIter is not very friendly and according to Chi Ma, not very well known.
  • There are other solutions
  • The low level API with pagination should be exposed
  • It’s a tradeoff
  • Exposing several similar APIs might be confusing and presents a challenge for SDK developers.
  • I don’t see a good solution. Maybe custom lazy list type. More thought should be invested in finding a better solution.
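One possible shape for such a “lazy list” type, purely as a thought experiment: LazyList, pages() and toArray() are hypothetical names, not anything the AWS SDK provides. The sketch exposes items by default, keeps the page-level view available, and offers an escape hatch for callers who would rather not write for await.

```javascript
// Hypothetical "lazy list" wrapper; none of these names exist in the AWS SDK.
class LazyList {
  // makePages: () => async iterable of pages
  // pickItems: (page) => array of items, or undefined for an empty page
  constructor(makePages, pickItems) {
    this.makePages = makePages;
    this.pickItems = pickItems;
  }

  // Low-level view: explicit pages for callers who want that control.
  pages() {
    return this.makePages();
  }

  // Default view: individual items, with pages fetched on demand.
  async *[Symbol.asyncIterator]() {
    for await (const page of this.makePages()) {
      yield* this.pickItems(page) ?? [];
    }
  }

  // Convenience for callers who do not want to write `for await`.
  async toArray() {
    const items = [];
    for await (const item of this) items.push(item);
    return items;
  }
}

// Stand-in for a real paginator such as paginateListTables.
async function* fakePages() {
  yield { TableNames: ["users"] };
  yield { TableNames: ["orders", "invoices"] };
}

const tables = new LazyList(fakePages, (page) => page.TableNames);
```

Here `await tables.toArray()` collects every item, while `tables.pages()` still hands back raw pages. Whether offering both views on one object is less confusing than two separate APIs is exactly the tradeoff listed above.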

( to be continued )


Have a nice and productive day … to the extent possible of course 🙂