Chunk

admin | Digital | 26/11/2024 08:16:11

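⚠️ If you are photosensitive, you may want to skip this one. See the static image below: these lights will start flashing really fast!

How does the internet work?

Remember the title... we are talking about streams here.

I could talk about protocols, packets, ordering, acks and nacks... but we are talking about streams here, and as you probably guessed (I believe in you =D), a stream is either binary or strings.

Yes, strings may be compressed before being sent... but for what we usually care about in frontend and backend development... strings and binary.

In the examples below, I'll be using JS streams.

While Node has its own legacy implementation, we have ways to handle streams with the same code, whether on the front or on the back.

Other languages have their own ways of handling streams, but as you will see... the actual code that deals with them is not that complicated (which is not to say that nothing complex is going on).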

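The example problem

Your frontend has to consume data from multiple sources.

While you could reach each source individually through its IP/port, you can put them behind an API gateway for ease of use and control.

The repo

Check the repository at the link to learn how to run it yourself, so you can play with it.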

https://github.com/noriller/chunk-busters

Video

Video version to follow along:

https://youtu.be/qucaoffi0fm

v0 - The naive implementation

You have the sources; you fetch, wait and render. Rinse and repeat.

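// wait for each source in turn, handling it before starting the next
const result1 = await fetch1();
handleResult(result1);
const result2 = await fetch2();
handleResult(result2);
...
const result9 = await fetch9();
handleResult(result9);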

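You might think nobody would actually do this...

In this example it's obvious that something is wrong, but it's not that hard to fall into this trap.

The obvious part: it's slow. You have to fire and wait for each request, and if one is slow... you have to wait.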
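To make that cost concrete, here is a minimal sketch with simulated sources. `fakeFetch`, `sequential` and the delays are made up for illustration and are not part of the repo; the point is only that the total time is roughly the sum of all the delays.

```javascript
// Hypothetical stand-in for a real request:
// resolves with `id` after `ms` milliseconds.
const fakeFetch = (id, ms) =>
  new Promise((resolve) => setTimeout(() => resolve(id), ms));

// v0: fire one request, wait for it, handle it, then move to the next.
async function sequential(delays) {
  const handled = [];
  const start = Date.now();
  for (const [id, ms] of delays.entries()) {
    const result = await fakeFetch(id, ms);
    handled.push(result); // stands in for handleResult(result)
  }
  return { handled, elapsed: Date.now() - start };
}

sequential([30, 10, 20]).then(({ handled, elapsed }) => {
  // The elapsed time is roughly the SUM of the delays (about 60 ms here).
  console.log(handled, `${elapsed}ms`);
});
```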

v1 - The eager version

You know you don't want to wait for each request individually... so you fire them all and then wait for them to complete.

const results = await Promise.all([
  fetch1(),
  fetch2(),
  ...
  fetch9(),
]);
handleAllResults(results);

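This is probably what you do, so it's good, right?

I mean, unless you have a single request that is slow... which means that even if all the others are done... you still have to wait for that one to complete.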
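A small sketch of that behavior, again with a made-up `fakeFetch` and arbitrary delays: everything starts at once, but `Promise.all` only hands back results once the slowest request has settled.

```javascript
// Hypothetical stand-in for a real request:
// resolves with `id` after `ms` milliseconds.
const fakeFetch = (id, ms) =>
  new Promise((resolve) => setTimeout(() => resolve(id), ms));

// v1: all requests start immediately...
async function eager(delays) {
  const start = Date.now();
  const results = await Promise.all(
    delays.map((ms, id) => fakeFetch(id, ms))
  );
  // ...but nothing is available until the slowest one settles.
  return { results, elapsed: Date.now() - start };
}

eager([10, 60, 20]).then(({ results, elapsed }) => {
  // Results keep the input order; the elapsed time tracks
  // the slowest delay rather than the sum.
  console.log(results, `${elapsed}ms`);
});
```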

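v2 - The smarter, eager version

You know some requests might be slower, so you still fire them all and wait, but as each one arrives you already do something with its result, so by the time the last one arrives the others have already been handled.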

await Promise.all([
  fetch1().then(handleResult),
  fetch2().then(handleResult),
  ...
  fetch9().then(handleResult),
]);
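A minimal sketch of the pattern above, with a hypothetical `fakeFetch` and made-up delays: because each request carries its own `.then`, results get handled in completion order rather than in the order they were requested.

```javascript
// Hypothetical stand-in for a real request:
// resolves with `id` after `ms` milliseconds.
const fakeFetch = (id, ms) =>
  new Promise((resolve) => setTimeout(() => resolve(id), ms));

// v2: each request handles its own result as soon as it arrives.
async function eagerWithHandlers(delays) {
  const handledOrder = [];
  const handleResult = (id) => handledOrder.push(id);
  await Promise.all(
    delays.map((ms, id) => fakeFetch(id, ms).then(handleResult))
  );
  return handledOrder;
}

eagerWithHandlers([60, 10, 35]).then((order) => {
  // The fastest request is handled first, the slowest last.
  console.log(order);
});
```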

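This MUST be the best solution, right?

Hmm... notice anything odd?

v3 - I was lying to you... this is what v1 should look like

Remember v1? Yeah... this is what it should have looked like:

It turns out that HTTP/1 limits how many connections you can have open to the same endpoint, and not only that... the limit is browser dependent, so each browser might have a different one.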
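The effect can be imitated without a browser. The sketch below is a simulation under stated assumptions: `runWithLimit` is a hypothetical helper, and the limit of 6 is only an illustrative value (the real number depends on the browser). Requests beyond the limit simply wait for a free slot.

```javascript
// Hypothetical stand-in for a real request.
const fakeFetch = (id, ms) =>
  new Promise((resolve) => setTimeout(() => resolve(id), ms));

// Runs `tasks` (functions returning promises) with at most `limit`
// in flight, roughly how requests to one host get queued.
async function runWithLimit(tasks, limit) {
  const results = new Array(tasks.length);
  let next = 0;
  let inFlight = 0;
  let maxInFlight = 0;

  // Each worker keeps pulling the next task until none are left.
  async function worker() {
    while (next < tasks.length) {
      const index = next++;
      inFlight++;
      maxInFlight = Math.max(maxInFlight, inFlight);
      results[index] = await tasks[index]();
      inFlight--;
    }
  }

  const workerCount = Math.min(limit, tasks.length);
  await Promise.all(Array.from({ length: workerCount }, () => worker()));
  return { results, maxInFlight };
}

// Nine "requests", but only six may run at the same time (assumed limit).
const tasks = Array.from({ length: 9 }, (_, id) => () => fakeFetch(id, 10));
runWithLimit(tasks, 6).then(({ results, maxInFlight }) => {
  console.log(results.length, 'results, at most', maxInFlight, 'in flight');
});
```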

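You might think you can just use HTTP/2 and call it a day... but even if that were a good solution, you would still have to deal with multiple endpoints in the frontend.

Is there even a good solution to this?

v4 - Enter streams!

Let's revisit v0, but using streams...

You're smart, so you were probably expecting this, since the warning kind of spoiled it... but yes... what you were seeing before was not all the data the backend was generating.

Anyway... we render as we fetch.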

// usually we do this:
await fetch(...).then((res) => {
  // this json call accumulates the whole response body
  // and only then returns the parsed result for you to use
  return res.json();
});

If we tap into the stream as it comes in, we can do something with each chunk of data as it arrives. (Yes! Just like ChatGPT and the like.)
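As a rough sketch of what "tapping into the stream" means, the snippet below reads a `Response` body piece by piece. To stay self-contained it builds the response from a local, made-up `makeSlowStream` instead of calling a real server; it assumes a runtime with the web streams globals (a browser, or Node 18+).

```javascript
// Hypothetical source: emits a few text pieces with a small delay,
// standing in for a backend that sends data as it is generated.
function makeSlowStream(pieces, delayMs) {
  const encoder = new TextEncoder();
  let i = 0;
  return new ReadableStream({
    async pull(controller) {
      if (i >= pieces.length) {
        controller.close();
        return;
      }
      await new Promise((resolve) => setTimeout(resolve, delayMs));
      controller.enqueue(encoder.encode(pieces[i++]));
    },
  });
}

// Reads a Response body as it arrives instead of waiting for `res.text()`.
async function readAsItComes(res, onChunk) {
  const reader = res.body.getReader();
  const decoder = new TextDecoder();
  while (true) {
    const { done, value } = await reader.read();
    if (done) break;
    onChunk(decoder.decode(value, { stream: true }));
  }
  // Flush anything the decoder was still holding on to.
  const tail = decoder.decode();
  if (tail) onChunk(tail);
}

const res = new Response(makeSlowStream(['Hello', ', ', 'streams!'], 10));
readAsItComes(res, (text) => console.log('got:', text));
```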

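Even though v0 is the worst way to handle this, using streams improves it a lot. Even if the total wait time is the same, you can trick the user by showing something, anything, in the meantime.

v5 - v1 again, but with streams!

The HTTP/1 problem is still a problem, but again, you can already see things as they come in.

Yeah... I can't stall any longer... so...

v6 - One API to rule them all!

Or... maybe I can?

You see, the frontend has had to manage too much... if we offload that to the backend, you can have one endpoint that handles all the sources.

This solves both the complexity on the frontend and the HTTP/1 problem.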

const results = await fetchAll();
handleAllResults(results);

// if you had already offloaded this to the backend,
// then you might have had data like
{
    data0: [...],
    data1: [...],
    ...
    data9: [...],
}
// since the backend would be doing the fetching,
// waiting for all the data and then sending it to the front

// in this case, however...
[
    { from: 'data9', ... },
    { from: 'data1', ... },
    { from: 'data1', ... },
    { from: 'data0', ... },
    ...
    { from: 'data1', ... },
]
// the backend will stream from the sources,
// handle, parse or otherwise process each piece,
// then send it to the front
// the order is not guaranteed: pieces from different
// sources arrive interleaved, so you need to add something
// to know which source sent each piece
// but you can receive piece by piece as it's generated
// and handle them accordingly
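One possible way to produce that interleaved, tagged output is to merge several async iterables and label each piece with its origin. This is only a sketch: `fakeSource` and `mergeTagged` are hypothetical names, and the repo may do this differently.

```javascript
// Hypothetical source: yields `count` items, one every `delayMs` ms.
async function* fakeSource(count, delayMs) {
  for (let i = 0; i < count; i++) {
    await new Promise((resolve) => setTimeout(resolve, delayMs));
    yield i;
  }
}

// Merges several async iterables into one stream of pieces,
// tagging each piece with the name of the source it came from.
async function* mergeTagged(sources) {
  // Ask a source for its next item, remembering who was asked.
  const ask = (from, iterator) =>
    iterator.next().then((result) => ({ from, iterator, result }));

  // One pending request per source; whichever settles first is emitted.
  const pending = new Map(
    Object.entries(sources).map(([from, source]) => [
      from,
      ask(from, source[Symbol.asyncIterator]()),
    ])
  );

  while (pending.size > 0) {
    const { from, iterator, result } = await Promise.race(pending.values());
    if (result.done) {
      pending.delete(from); // this source is exhausted
      continue;
    }
    pending.set(from, ask(from, iterator));
    yield { from, value: result.value };
  }
}

async function demo() {
  const sources = { data0: fakeSource(2, 25), data1: fakeSource(3, 10) };
  for await (const piece of mergeTagged(sources)) {
    console.log(piece); // pieces from both sources, interleaved
  }
}
demo();
```

Within a single source the order is preserved; across sources it depends on timing, which is why each piece carries its `from` tag.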

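v7 - And finally... one API, multiple sources and streaming.

We call one API, which calls all the sources, streams the data, handles it, and passes it on to the frontend, which in turn renders the data as it arrives.

The code for this is basically the same on the front and on the back: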

// in both frontend and backend we can use the same `fetch` api
fetch(url, {
  // we use the AbortController to cancel the request
  // if the user navigates away (or if connection is closed)
  signal,
}).then(async (res) => {
  // instead of the "normal" `res.json()` or `res.text()`
  // we use the body of the response
  // and get the reader from the body
  // this is a `ReadableStream`
  // (there are other types of streams and ways to consume them)
  const reader = res.body?.getReader();
  if (!reader) {
    return;
  }

  // remember we are "low level" here
  // streams can be strings or binary data
  // for this one we know it's a string
  // so we will accumulate it in a string
  let buffer = '';
  // one decoder for the whole stream, so a multi-byte character
  // split across two reads is still decoded correctly
  const decoder = new TextDecoder();

  // let's read until the stream is done
  while (true) {
    const { done, value } = await reader.read();

    // if the stream is done (or the user aborted)
    if (done || signal.aborted) {
      // cancel the stream, exit the loop
      reader.cancel(signal.reason);
      break;
    }

    // if we have a value
    if (value) {
      // decode and add to the buffer
      buffer += decoder.decode(value, { stream: true });

      // here is where the magic happens!
      // we check if we can consume something from the buffer
      if (canConsumeSomething(buffer)) {
        // value above might be half or double of one "chunk"
        // so, we need to slice what we can consume
        // the if above might be changed to a "while"
        // or the chunk below might be "multiple"
        // consumable values... it all depends on how
        // you will use it.
        const [chunk, remaining] = consumeSomething(buffer);

        // in the frontend, make it render the chunk
        // (in react, we can simply update the state
        // and let react handle the rerendering based on that)
        renderChunk(chunk);
        // in the backend... send the response
        // we can always parse, transform before that
        // but the back will probably either send to the
        // frontend, another service or to a DB.
        sendResponse(chunk);

        // and then we update the buffer
        buffer = remaining;
      }
    }
  }
});
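The snippet above uses `signal` without showing where it comes from. A minimal sketch, assuming it is created with an `AbortController`: the endless stream and the `readUntilAborted` helper are made up so the example runs without a server, and it assumes a runtime with the web streams globals (a browser, or Node 18+).

```javascript
// Hypothetical endless source: emits an increasing counter every 5 ms.
function makeEndlessStream() {
  let n = 0;
  let timer;
  return new ReadableStream({
    start(streamController) {
      timer = setInterval(() => streamController.enqueue(n++), 5);
    },
    cancel() {
      clearInterval(timer); // runs when the reader cancels the stream
    },
  });
}

// Same loop shape as above: stop when the stream ends or the signal aborts.
async function readUntilAborted(stream, signal) {
  const reader = stream.getReader();
  const seen = [];
  while (true) {
    const { done, value } = await reader.read();
    if (done || signal.aborted) {
      await reader.cancel(signal.reason);
      break;
    }
    seen.push(value);
  }
  return seen;
}

const abortController = new AbortController();
// Stands in for "the user navigated away".
setTimeout(() => abortController.abort('user navigated away'), 40);

readUntilAborted(makeEndlessStream(), abortController.signal).then((seen) => {
  console.log(`read ${seen.length} values before aborting`);
});
```

Note that this loop only notices the abort on the next `read()`, which is fine for a source that keeps emitting.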

Yes... that's it (in its most basic and simple form).

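We add the string to the buffer, parse it, check whether there is a usable chunk, use it, then forget it. This means you can receive and consume terabytes of data... one chunk at a time, using very little RAM.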
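The `canConsumeSomething` and `consumeSomething` helpers are left undefined in the code above. Here is one hypothetical way to write them, assuming the stream carries newline-delimited JSON (one JSON value per line, no blank lines); a different framing would need different helpers.

```javascript
// Assumption: messages are newline-delimited JSON.

// True when the buffer holds at least one complete line.
function canConsumeSomething(buffer) {
  return buffer.includes('\n');
}

// Splits off the first complete line and parses it.
// Returns [parsedChunk, remainingBuffer].
function consumeSomething(buffer) {
  const newlineAt = buffer.indexOf('\n');
  const line = buffer.slice(0, newlineAt);
  const remaining = buffer.slice(newlineAt + 1);
  return [JSON.parse(line), remaining];
}

// Pieces as they might come off the network:
// they do not line up with message boundaries.
const pieces = [
  '{"from":"da',
  'ta1","n":1}\n{"from":"data0","n":2}\n{"fr',
];

let buffer = '';
const received = [];
for (const piece of pieces) {
  buffer += piece;
  // A "while" instead of an "if", since one piece may hold several messages.
  while (canConsumeSomething(buffer)) {
    const [chunk, remaining] = consumeSomething(buffer);
    received.push(chunk);
    buffer = remaining;
  }
}

console.log(received); // the two complete messages
console.log(JSON.stringify(buffer)); // the incomplete tail stays buffered
```

Only the unfinished tail is ever kept in memory, which is what keeps RAM usage low no matter how much data flows through.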