Discussion:
ATS QUIC Hackathon Result
Masakazu Kitajo
2018-11-19 01:12:57 UTC
Hi all,

Here are the results of the ATS QUIC Hackathon in Tokyo.

We couldn't make many code changes, but we were able to find a cause of the
performance issue. This is a big win that we couldn't have achieved without
this hackathon.

## Input
Possible reasons for low performance:
- Packets are sent / received one by one with sendmsg and recvmsg
- The WRITE_READY event is rescheduled in a spin-like loop
- Multiple copies happen while sending data

### Why is WRITE_READY scheduled by NetVC itself?
Because there is only one FD
-> Delayed bind should resolve the WRITE_READY spin issue

### Why not sendmmsg?
A concern raised at the last hackathon was that sendmmsg behaves as
all-or-nothing
-> sendmmsg returns the number of messages sent, so there is no need to
re-send all the packets
-> Use sendmmsg as well as recvmmsg
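
A minimal, Linux-only sketch of the point above: because sendmmsg(2) returns how many messages were sent, a caller can resume from that offset instead of re-sending the whole batch. The helper name `send_batch` and the use of `std::string` as a packet container are assumptions made for illustration; this is not ATS code.

```cpp
#ifndef _GNU_SOURCE
#define _GNU_SOURCE // sendmmsg(2) and struct mmsghdr are GNU/Linux extensions
#endif
#include <sys/socket.h>
#include <sys/uio.h>
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: sends each element of `packets` as one datagram on a
// connected datagram socket. Returns how many datagrams the kernel accepted.
// If the socket would block (or another error occurs), it stops early and the
// caller can retry later starting at the returned index.
std::size_t send_batch(int fd, const std::vector<std::string> &packets)
{
  assert(fd >= 0);
  std::vector<iovec> iovs(packets.size());
  std::vector<mmsghdr> msgs(packets.size()); // value-initialized, i.e. zeroed
  for (std::size_t i = 0; i < packets.size(); ++i) {
    iovs[i].iov_base = const_cast<char *>(packets[i].data());
    iovs[i].iov_len = packets[i].size();
    msgs[i].msg_hdr.msg_iov = &iovs[i];
    msgs[i].msg_hdr.msg_iovlen = 1;
  }

  std::size_t sent = 0;
  while (sent < msgs.size()) {
    int n = sendmmsg(fd, msgs.data() + sent,
                     static_cast<unsigned int>(msgs.size() - sent), 0);
    if (n < 0) {
      if (errno == EINTR) {
        continue; // interrupted before anything was sent, try again
      }
      break; // EAGAIN or a real error: report progress so far
    }
    sent += static_cast<std::size_t>(n); // skip only what was actually sent
  }
  return sent;
}
```

The receive side could be batched in the same way with recvmmsg(2), which likewise returns the number of messages received.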

### Use ET_SSL for encryption?
Encryption might block the thread
-> ET_NET does it for HTTPS, so it would not be a problem
-> We should increase the number of ET_NET threads if it really matters

### Less copy with IOBufferBlock?
It was the consensus at the last hackathon, but many things have changed
-> We should stick with the plan if possible
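
As a rough illustration of the "less copy" idea, the sketch below hands a chain of buffer blocks to the kernel with one scatter-gather sendmsg(2) call, rather than first copying the blocks into one contiguous buffer. The `Block` struct is a hypothetical stand-in for a chained buffer block; it is not the real IOBufferBlock API, and `send_chain` is an illustrative name.

```cpp
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a chained buffer block (not the ATS class).
struct Block {
  const char *data;
  std::size_t len;
  const Block *next;
};

// Sends the whole chain as a single datagram without an intermediate copy in
// user space: each block becomes one iovec entry. Returns the sendmsg(2)
// result, i.e. the number of bytes sent or -1 on error.
ssize_t send_chain(int fd, const Block *head)
{
  assert(fd >= 0);
  std::vector<iovec> iov;
  for (const Block *b = head; b != nullptr; b = b->next) {
    iovec v;
    v.iov_base = const_cast<char *>(b->data);
    v.iov_len = b->len;
    iov.push_back(v);
  }
  msghdr msg{};
  msg.msg_iov = iov.data();
  msg.msg_iovlen = iov.size();
  return sendmsg(fd, &msg, 0);
}
```

The kernel still copies the data into its own socket buffers; the sketch only removes the extra user-space copy.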

## Discussion

### What should we work on?
- The performance issue has priority
- H3 and QPACK have priority too, but are too big for the hackathon timeframe

## Hack

### Performance issue
We found that WRITE_READY_INTERVAL is one of the causes.
-> connect + bind should remove this interval
We still see ~30ms delays before sending packets in the queue.
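
For readers unfamiliar with the "connect + bind" idea, here is a minimal Linux-oriented sketch of opening a per-connection UDP socket: it is bound to the local address and then connected to the peer, so that the connection gets its own FD that can be registered with epoll. The helper names `make_addr` and `open_connected_udp` are hypothetical. Using SO_REUSEPORT to share the port with a listening socket is an assumption about how this could be done, and it is not taken from the ATS implementation.

```cpp
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>
#include <cassert>
#include <cstdint>

// Hypothetical helper: builds an IPv4 socket address from a dotted-quad string.
sockaddr_in make_addr(const char *ip, std::uint16_t port)
{
  sockaddr_in a{};
  a.sin_family = AF_INET;
  a.sin_port = htons(port);
  int rc = inet_pton(AF_INET, ip, &a.sin_addr);
  assert(rc == 1);
  (void)rc;
  return a;
}

// Hypothetical helper: returns a UDP socket bound to `local` and connected to
// `peer`, or -1 on failure. After connect(), the socket only receives
// datagrams from `peer`, and send()/recv() can be used without addresses.
int open_connected_udp(const sockaddr_in &local, const sockaddr_in &peer)
{
  int fd = socket(AF_INET, SOCK_DGRAM, 0);
  if (fd < 0) {
    return -1;
  }
  int on = 1;
  // Assumption: the listening socket also sets SO_REUSEPORT, so that both
  // sockets may be bound to the same local port.
  setsockopt(fd, SOL_SOCKET, SO_REUSEPORT, &on, sizeof on);
  if (bind(fd, reinterpret_cast<const sockaddr *>(&local), sizeof local) != 0 ||
      connect(fd, reinterpret_cast<const sockaddr *>(&peer), sizeof peer) != 0) {
    close(fd);
    return -1;
  }
  return fd;
}
```

A real server would also make the socket non-blocking and would have to handle packets that arrive on the listening socket before the connected socket exists.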

### Memory allocator
Since we have many classes for QUIC, we would need a lot of allocators.
Using jemalloc would be an option, but we are not sure we really want it.
Using class allocators for QUIC frames looks unnecessary (#4621).

### Generate random token (#4494)
PR #4598 was created.

### sendmmsg / recvmmsg
sendmsg_x and recvmsg_x do not seem to be public system calls.
We need to find alternatives.

### Ports for QUIC (#4410)
Not exactly a QUIC issue, but PR #4600 was created.

### H2 crash issue (#4504)
The mutex for H2ConnectionState is released too early.
PR #4504 was created.

## Other
According to the minutes of IETF103, the RFC won't be published until July.
We can still aim for the 9.0 release if there are no big changes to the specs,
but 9.1 is more plausible.

Thanks,
Masakazu
宋辰伟
2018-11-19 08:30:11 UTC
Hi maskit and koshiba

What does the 30ms delay mean? How can it be reproduced?

SCW00
seem
Masakazu Kitajo
2018-11-20 01:13:37 UTC
I have no idea where the delay comes from. I posted the details in the issue
(https://github.com/apache/trafficserver/issues/3552#issuecomment-440099449).

Thanks,
Masakazu
Walt Karas
2018-11-20 01:21:56 UTC
What strategy or strategies do we use in ATS to make sure that we
don't do blocking I/O that blocks a thread with queued event handlers
(not dependent on the I/O operation)?
Leif Hedstrom
2018-11-20 01:30:49 UTC
All network IO is non-blocking; disk IO is scheduled on dedicated IO threads (scaled by the number of disks). The latter is essentially our own AIO, but we also support Linux native AIO.

— Leif
宋辰伟
2018-11-20 01:47:30 UTC
Hi

Actually, there are some blocking operations in ATS, such as epoll. We block in epoll until a net event arrives or the timeout expires. So if any other event (one that is not a net event) is scheduled, it stays pending until epoll's timeout fires, unless you use a signal to interrupt the blocking call.

Scw00
Leif Hedstrom
2018-11-20 02:10:50 UTC
Post by 宋辰伟
Hi
Actually, there are some blocking operations in ATS, such as epoll. We block in epoll until a net event arrives or the timeout expires. So if any other event (one that is not a net event) is scheduled, it stays pending until epoll's timeout fires, unless you use a signal to interrupt the blocking call.
Yeah, sure, but in theory at least, if there are active connections, epoll would almost always have events :).

But yes, things not on the epoll event loop could get stuck on low-activity boxes. Speaking of which, there might still be improvements to make here with timerfd and eventfd. We did some work there a few years ago.
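
To make the eventfd idea concrete, here is a minimal Linux-only sketch of waking a thread that is blocked in epoll_wait: an eventfd is registered with the epoll instance, and any thread that schedules a non-network event writes to it. The names `WakeableLoop`, `make_loop`, `wake` and `wait_once` are hypothetical and do not correspond to ATS code.

```cpp
#include <sys/epoll.h>
#include <sys/eventfd.h>
#include <unistd.h>
#include <cassert>
#include <cstdint>

// Hypothetical pairing of an epoll instance with an eventfd used for wake-ups.
struct WakeableLoop {
  int epfd;
  int wakefd;
};

WakeableLoop make_loop()
{
  WakeableLoop l;
  l.epfd = epoll_create1(0);
  l.wakefd = eventfd(0, EFD_NONBLOCK);
  assert(l.epfd >= 0 && l.wakefd >= 0);
  epoll_event ev{};
  ev.events = EPOLLIN; // readable whenever the eventfd counter is non-zero
  ev.data.fd = l.wakefd;
  int rc = epoll_ctl(l.epfd, EPOLL_CTL_ADD, l.wakefd, &ev);
  assert(rc == 0);
  (void)rc;
  return l;
}

// May be called from any thread after queueing a non-network event.
void wake(const WakeableLoop &l)
{
  std::uint64_t one = 1;
  ssize_t n = write(l.wakefd, &one, sizeof one);
  (void)n;
}

// Blocks for at most `timeout_ms`. Returns true if the eventfd woke the loop,
// false if the wait timed out or failed.
bool wait_once(const WakeableLoop &l, int timeout_ms)
{
  epoll_event ev{};
  int n = epoll_wait(l.epfd, &ev, 1, timeout_ms);
  if (n == 1 && ev.data.fd == l.wakefd) {
    std::uint64_t count = 0;
    ssize_t r = read(l.wakefd, &count, sizeof count); // resets the counter
    (void)r;
    return true;
  }
  return false;
}
```

With this pattern a queued event no longer has to wait for the epoll timeout; a timerfd could be registered in the same way for timed events.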

Cheers,

— Leif
宋辰伟
2018-11-20 02:14:01 UTC
Currently, QUIC events don't use the epoll in ET_NET, so it really is a problem.
Leif Hedstrom
2018-11-20 02:17:25 UTC
Post by 宋辰伟
Currently, QUIC events don't use the epoll in ET_NET, so it really is a problem.
Ah. Yes, part of our planning session included moving all this to ET_NET. That is part of using connect+bind on the QUIC sessions.

Cheers,

— Leif
宋辰伟
2018-11-20 01:51:44 UTC
Hi masaori

Sorry for the wrong name.

Hi maskit and masaori

File an issue to try to figure out the receive-side delays.

scw00