[lvc-project] [PATCH] drivers:atlx: Prevent integer overflow in statistics aggregation
Keller, Jacob E
jacob.e.keller at intel.com
Tue Oct 8 20:13:53 MSK 2024
> -----Original Message-----
> From: Rand Deeb <rand.sec96 at gmail.com>
> Sent: Tuesday, October 8, 2024 9:59 AM
> To: Jakub Kicinski <kuba at kernel.org>
> Cc: Chris Snook <chris.snook at gmail.com>; David S . Miller
> <davem at davemloft.net>; Eric Dumazet <edumazet at google.com>; Paolo Abeni
> <pabeni at redhat.com>; Christian Marangi <ansuelsmth at gmail.com>;
> netdev at vger.kernel.org; linux-kernel at vger.kernel.org; deeb.rand at confident.ru;
> lvc-project at linuxtesting.org; voskresenski.stanislav at confident.ru
> Subject: Re: [PATCH] drivers:atlx: Prevent integer overflow in statistics aggregation
>
> On Tue, Oct 8, 2024 at 3:27 AM Jakub Kicinski <kuba at kernel.org> wrote:
> >
> > On Mon, 7 Oct 2024 12:29:36 +0300 Rand Deeb wrote:
> > > The `atl1_inc_smb` function aggregates various RX and TX error counters
> > > from the `stats_msg_block` structure. Currently, the arithmetic operations
> > > are performed using `u32` types, which can lead to integer overflow when
> > > summing large values. This overflow occurs before the result is cast to
> > > a `u64`, potentially resulting in inaccurate network statistics.
> > >
> > > To mitigate this risk, each operand in the summation is explicitly cast to
> > > `u64` before performing the addition. This ensures that the arithmetic is
> > > executed in 64-bit space, preventing overflow and maintaining accurate
> > > statistics regardless of the system architecture.
> >
> > Thanks for the nice commit message, but honestly I don't think
> > the error counters can overflow u32 on an ancient NIC like this.
>
2^32-1 = 4294967295
If we assume that the card operates for at least 10 years, you will need an error rate of ~14 per second to overflow a 32bit counter over the 10 year period. Longer operation uptime time could decrease the error rate. That does seem unlikely.
> Hi Jakub,
>
> Thanks for your feedback, much appreciated!
>
> Honestly, when I was investigating this, I had the same thoughts regarding
> the possibility of the counters overflowing. However, I want to clarify
> that the variables where we store the results of these summations (like
> new_rx_errors, new_tx_errors, etc.) are already u64 types. Given that, it
> seems logical to cast the operands to u64 before the addition to ensure
> consistency and avoid any potential issues during the summation.
>
> Additionally, all counters in the atl1_sft_stats structure are also
> defined as u64, which reinforces the rationale for casting the operands in
> the summation as well.
>
> Thanks again for your input!
>
> Best regards,
> Rand Deeb
Still, if the resulting storage is already 64bit, I think it does make sense to do the arithmetic in 64bit.
More information about the lvc-project
mailing list