rfc9611.original   rfc9611.txt 
Network A. Antony Internet Engineering Task Force (IETF) A. Antony
Internet-Draft secunet Request for Comments: 9611 secunet
Intended status: Standards Track T. Brunner Category: Standards Track T. Brunner
Expires: 3 November 2024 codelabs ISSN: 2070-1721 codelabs
S. Klassert S. Klassert
secunet secunet
P. Wouters P. Wouters
Aiven Aiven
2 May 2024 July 2024
IKEv2 support for per-resource Child SAs Internet Key Exchange Protocol Version 2 (IKEv2) Support for
draft-ietf-ipsecme-multi-sa-performance-09 Per-Resource Child Security Associations (SAs)
Abstract Abstract
This document defines one Notify Message Status Types and one Notify This document defines one Notify Message Status Types payload and one
Message Error Types payload for the Internet Key Exchange Protocol Notify Message Error Types payload for the Internet Key Exchange
Version 2 (IKEv2) to support the negotiation of multiple Child Protocol Version 2 (IKEv2) to support the negotiation of multiple
Security Associations (SAs) with the same Traffic Selectors used on Child Security Associations (SAs) with the same Traffic Selectors
different resources, such as CPUs, to increase bandwidth of IPsec used on different resources, such as CPUs, to increase bandwidth of
traffic between peers. IPsec traffic between peers.
The SA_RESOURCE_INFO notification is used to convey information that The SA_RESOURCE_INFO notification is used to convey information that
the negotiated Child SA and subsequent new Child SAs with the same the negotiated Child SA and subsequent new Child SAs with the same
Traffic Selectors are a logical group of Child SAs where most or all Traffic Selectors are a logical group of Child SAs where most or all
of the Child SAs are bound to a specific resource, such as a specific of the Child SAs are bound to a specific resource, such as a specific
CPU. The TS_MAX_QUEUE notify conveys that the peer is unwilling to CPU. The TS_MAX_QUEUE notify conveys that the peer is unwilling to
create more additional Child SAs for this particular negotiated create more additional Child SAs for this particular negotiated
Traffic Selector combination. Traffic Selector combination.
Using multiple Child SAs with the same Traffic Selectors has the Using multiple Child SAs with the same Traffic Selectors has the
benefit that each resource holding the Child SA has its own Sequence benefit that each resource holding the Child SA has its own Sequence
Number Counter, ensuring that CPUs don't have to synchronize their Number Counter, ensuring that CPUs don't have to synchronize their
cryptographic state or disable their packet replay protection. cryptographic state or disable their packet replay protection.
Status of This Memo Status of This Memo
This Internet-Draft is submitted in full conformance with the This is an Internet Standards Track document.
provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet-
Drafts is at https://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months This document is a product of the Internet Engineering Task Force
and may be updated, replaced, or obsoleted by other documents at any (IETF). It represents the consensus of the IETF community. It has
time. It is inappropriate to use Internet-Drafts as reference received public review and has been approved for publication by the
material or to cite them other than as "work in progress." Internet Engineering Steering Group (IESG). Further information on
Internet Standards is available in Section 2 of RFC 7841.
This Internet-Draft will expire on 3 November 2024. Information about the current status of this document, any errata,
and how to provide feedback on it may be obtained at
https://www.rfc-editor.org/info/rfc9611.
Copyright Notice Copyright Notice
Copyright (c) 2024 IETF Trust and the persons identified as the Copyright (c) 2024 IETF Trust and the persons identified as the
document authors. All rights reserved. document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents (https://trustee.ietf.org/ Provisions Relating to IETF Documents
license-info) in effect on the date of publication of this document. (https://trustee.ietf.org/license-info) in effect on the date of
Please review these documents carefully, as they describe your rights publication of this document. Please review these documents
and restrictions with respect to this document. Code Components carefully, as they describe your rights and restrictions with respect
extracted from this document must include Revised BSD License text as to this document. Code Components extracted from this document must
described in Section 4.e of the Trust Legal Provisions and are include Revised BSD License text as described in Section 4.e of the
provided without warranty as described in the Revised BSD License. Trust Legal Provisions and are provided without warranty as described
in the Revised BSD License.
Table of Contents Table of Contents
1. Introduction 1. Introduction
1.1. Requirements Language 1.1. Requirements Language
1.2. Terminology 1.2. Terminology
2. Performance bottlenecks 2. Performance Bottlenecks
3. Negotiation of CPU specific Child SAs 3. Negotiation of CPU-Specific Child SAs
4. Implementation Considerations 4. Implementation Considerations
5. Payload Format 5. Payload Format
5.1. SA_RESOURCE_INFO Notify Message Status Type payload 5.1. SA_RESOURCE_INFO Notify Message Status Type Payload
5.2. TS_MAX_QUEUE Notify Message Error Type Payload 5.2. TS_MAX_QUEUE Notify Message Error Type Payload
6. Operational Considerations 6. Operational Considerations
7. Security Considerations 7. Security Considerations
8. Implementation Status 8. IANA Considerations
8.1. Linux XFRM 9. References
8.2. Libreswan 9.1. Normative References
8.3. strongSwan 9.2. Informative References
8.4. iproute2 Acknowledgements
9. IANA Considerations Authors' Addresses
10. Acknowledgements
11. References
11.1. Normative References
11.2. Informative References
Authors' Addresses
1. Introduction 1. Introduction
Most IPsec implementations are currently limited to using one Most IPsec implementations are currently limited to using one
hardware queue or a single CPU resource for a Child SA. Running hardware queue or a single CPU resource for a Child SA. Running
packet stream encryption in parallel can be done, but there is a packet stream encryption in parallel can be done, but there is a
bottleneck of different parts of the hardware locking or waiting to bottleneck of different parts of the hardware locking or waiting to
get their sequence number assigned for the packet it is encrypting. get their sequence number assigned for the packet it is encrypting.
The result is that a machine with many such resources is limited to The result is that a machine with many such resources is limited to
only using one of these resources per Child SA. This severely limits using only one of these resources per Child SA. This severely limits
the throughput that can be attained. For example, at the time of the throughput that can be attained. For example, at the time of
writing, an unencrypted link of 10Gbps or more is commonly reduced to writing, an unencrypted link of 10 Gbps or more is commonly reduced
2-5Gbps when IPsec is used to encrypt the link using AES-GCM. By to 2-5 Gbps when IPsec is used to encrypt the link using AES-GCM. By
using the implementation specified in this document, aggregate using the implementation specified in this document, aggregate
throughput increased from 5Gbps using 1 CPU to 40-60 Gbps using 25-30 throughput increased from 5Gbps using 1 CPU to 40-60 Gbps using 25-30
CPUs. CPUs.
While this could be (partially) mitigated by setting up multiple While this could be (partially) mitigated by setting up multiple
narrowed Child SAs, for example using Populate From Packet (PFP) as narrowed Child SAs (for example, using Populate From Packet (PFP) as
specified in IPsec Architecture [RFC4301], this IPsec feature would specified in IPsec architecture [RFC4301]), this IPsec feature would
cause too many Child SAs (one per network flow) or too few Child SAs cause too many Child SAs (one per network flow) or too few Child SAs
(one network flow used on multiple CPUs). PFP is also not widely (one network flow used on multiple CPUs). PFP is also not widely
implemented. implemented.
To make better use of multiple network queues and CPUs, it can be To make better use of multiple network queues and CPUs, it can be
beneficial to negotiate and install multiple Child SAs with identical beneficial to negotiate and install multiple Child SAs with identical
Traffic Selectors. IKEv2 [RFC7296] already allows installing Traffic Selectors. IKEv2 [RFC7296] already allows installing
multiple Child SAs with identical Traffic Selectors, but it offers no multiple Child SAs with identical Traffic Selectors, but it offers no
method to indicate that the additional Child SA is being requested method to indicate that the additional Child SA is being requested
for performance increase reasons and is restricted to some resource for performance increase reasons and is restricted to some resource
(queue or CPU). (queue or CPU).
When an IKEv2 peer is receiving more additional Child SA's for a When an IKEv2 peer is receiving more additional Child SAs for a
single set of Traffic Selectors than it is willing to create, it can single set of Traffic Selectors than it is willing to create, it can
return an error notify of TS_MAX_QUEUE. return an error notify of TS_MAX_QUEUE.
1.1. Requirements Language 1.1. Requirements Language
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
"OPTIONAL" in this document are to be interpreted as described in BCP "OPTIONAL" in this document are to be interpreted as described in
14 [RFC2119] [RFC8174] when, and only when, they appear in all BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all
capitals, as shown here. capitals, as shown here.
1.2. Terminology 1.2. Terminology
This document uses the following terms defined in IKEv2 [RFC7296]: This document uses the following terms defined in IKEv2 [RFC7296]:
Notification Data, Traffic Selectors (TS), TSi/TSr, Child SA, Notification Data, Traffic Selector (TS), Traffic Selector initiator
Configuration Payload (CP), IKE SA, CREATE_CHILD_SA and (TSi), Traffic Selector responder (TSr), Child SA, Configuration
NO_ADDITIONAL_SAS. Payload (CP), IKE SA, CREATE_CHILD_SA, and NO_ADDITIONAL_SAS.
This document also uses the following terms defined in [RFC4301]: This document also uses the following terms defined in [RFC4301]:
SPD, SA. Security Policy Database (SPD), SA.
2. Performance bottlenecks 2. Performance Bottlenecks
There are several pragmatic reasons why most implementations must There are several pragmatic reasons why most implementations must
restrict a Child Security Association (SA) to a single specific restrict a Child Security Association (SA) to a single specific
hardware resource. A primary limitation arises from the challenges hardware resource. A primary limitation arises from the challenges
associated with sharing cryptographic states, counters, and sequence associated with sharing cryptographic states, counters, and sequence
numbers among multiple CPUs. When these CPUs attempt to numbers among multiple CPUs. When these CPUs attempt to
simultaneously utilize shared states, it becomes impractical to do so simultaneously utilize shared states, it becomes impractical to do so
without incurring a significant performance penalty. It is necessary without incurring a significant performance penalty. It is necessary
to negotiate and establish multiple Child Security Associations (SAs) to negotiate and establish multiple Child SAs with identical Traffic
with identical Traffic Selector initiator (TSi) and Traffic Selector Selector initiator (TSi) and Traffic Selector responder (TSr) on a
responder (TSr) on a per-resource basis." per-resource basis.
3. Negotiation of CPU specific Child SAs 3. Negotiation of CPU-Specific Child SAs
An initial IKEv2 exchange is used to setup an IKE SA and the initial An initial IKEv2 exchange is used to set up an IKE SA and the initial
Child SA. If multiple Child SAs with the same Traffic Selectors that Child SA. If multiple Child SAs with the same Traffic Selectors that
are bound to a single resource are desired, the initiator will add are bound to a single resource are desired, the initiator will add
the SA_RESOURCE_INFO notify payload to the Exchange negotiating the the SA_RESOURCE_INFO notify payload to the Exchange negotiating the
Child SA (e.g. IKE_AUTH or CREATE_CHILD_SA). If this initial Child Child SA (e.g., IKE_AUTH or CREATE_CHILD_SA). If this initial Child
SA will be tied to a specific resource, it MAY indicate this by SA will be tied to a specific resource, it MAY indicate this by
including an identifier in the Notification Data. A responder that including an identifier in the Notification Data. A responder that
is willing to have multiple Child SAs for the same Traffic Selectors is willing to have multiple Child SAs for the same Traffic Selectors
will respond by also adding the SA_RESOURCE_INFO notify payload in will respond by also adding the SA_RESOURCE_INFO notify payload in
which it MAY add a non-zero Notify Data. which it MAY add a non-zero Notify Data.
Additional resource-specific Child SAs are negotiated as regular Additional resource-specific Child SAs are negotiated as regular
Child SAs using the CREATE_CHILD_SA exchange and are similarly Child SAs using the CREATE_CHILD_SA exchange and are similarly
identified by an accompanying SA_RESOURCE_INFO notification. identified by an accompanying SA_RESOURCE_INFO notification.
Upon installation, each resource-specific Child SA is associated with Upon installation, each resource-specific Child SA is associated with
an additional local selector, such as the CPU. These resource- an additional local selector, such as the CPU. These resource-
specific Child SAs MUST be negotiated with identical Child SA specific Child SAs MUST be negotiated with identical Child SA
properties that were negotiated for the initial Child SA. This properties that were negotiated for the initial Child SA. This
includes cryptographic algorithms, Traffic Selectors, Mode (e.g. includes cryptographic algorithms, Traffic Selectors, Mode (e.g.,
transport mode), compression usage, etc. However, each Child SA does transport mode), compression usage, etc. However, each Child SA does
have its own keying material that is individually derived according have its own keying material that is individually derived according
to the regular IKEv2 process. The SA_RESOURCE_INFO notify payload to the regular IKEv2 process. The SA_RESOURCE_INFO notify payload
MAY be empty or MAY contain some identifying data. This identifying MAY be empty or MAY contain some identifying data. This identifying
data SHOULD be a unique identifier within all the Child SAs with the data SHOULD be a unique identifier within all the Child SAs with the
same TS payloads and the peer MUST only use it for debugging same TS payloads, and the peer MUST only use it for debugging
purposes. purposes.
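The requirement that resource-specific Child SAs share the properties of the initial Child SA, while keeping their own keying material, could be checked along the following lines. This is only a sketch: the ChildSAProps record, its field names, and matches_initial() are hypothetical and are not defined by this document or by any IKEv2 implementation.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ChildSAProps:
    """Hypothetical record of negotiated Child SA properties."""
    encr: str                 # negotiated algorithm, e.g. "aes-gcm-16"
    mode: str                 # "tunnel" or "transport"
    ts_i: tuple               # initiator Traffic Selectors
    ts_r: tuple               # responder Traffic Selectors
    ipcomp: bool = False      # compression usage
    # Keying material and the optional resource identifier differ per
    # Child SA, so they are excluded from the equality comparison.
    keymat: bytes = field(default=b"", compare=False)
    resource_id: bytes = field(default=b"", compare=False)


def matches_initial(initial, candidate):
    """True if candidate has the same negotiated properties as initial."""
    return initial == candidate
```

The resource identifier is left out of the comparison on purpose, since the text above limits its use to debugging.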
Additional Child SAs can be started on-demand or can be started all Additional Child SAs can be started on demand or can be started all
at once. Peers may also delete specific per-resource Child SAs if at once. Peers may also delete specific per-resource Child SAs if
they deem the associated resource to be idle. they deem the associated resource to be idle.
During the CREATE_CHILD_SA rekey for the Child SA, the During the CREATE_CHILD_SA rekey for the Child SA, the
SA_RESOURCE_INFO notification MAY be included, but regardless of SA_RESOURCE_INFO notification MAY be included, but regardless of
whether or not it is included, the rekeyed Child SA should be bound whether or not it is included, the rekeyed Child SA should be bound
to the same resource(s) as the Child SA that is being rekeyed. to the same resource(s) as the Child SA that is being rekeyed.
4. Implementation Considerations 4. Implementation Considerations
skipping to change at page 5, line 37 skipping to change at line 210
can still encrypt its packets using the Child SA that is available can still encrypt its packets using the Child SA that is available
for all CPUs. Alternatively, if an implementation finds it needs to for all CPUs. Alternatively, if an implementation finds it needs to
encrypt a packet but the current CPU does not have the resources to encrypt a packet but the current CPU does not have the resources to
encrypt this packet, it can relay that packet to a specific CPU that encrypt this packet, it can relay that packet to a specific CPU that
does have the capability to encrypt the packet, although this will does have the capability to encrypt the packet, although this will
come with a performance penalty. come with a performance penalty.
Performing per-CPU Child SA negotiations can result in both peers Performing per-CPU Child SA negotiations can result in both peers
initiating additional Child SAs at once. This is especially likely initiating additional Child SAs at once. This is especially likely
if per-CPU Child SAs are triggered by individual SADB_ACQUIRE if per-CPU Child SAs are triggered by individual SADB_ACQUIRE
[RFC2367] messages. Responders should install the additional Child messages [RFC2367]. Responders should install the additional Child
SA on a CPU with the least amount of additional Child SAs for this SA on a CPU with the least amount of additional Child SAs for this
TSi/TSr pair. TSi/TSr pair.
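One way a responder might choose the CPU with the fewest additional Child SAs is sketched below. The bookkeeping (a list of CPU indices, one entry per installed additional Child SA for the TSi/TSr pair) and the name pick_cpu() are assumptions made for illustration; the tie-break on the lowest CPU index is likewise an arbitrary choice.

```python
from collections import Counter


def pick_cpu(installed_cpus, num_cpus):
    """Return the CPU index holding the fewest additional Child SAs.

    installed_cpus: iterable of CPU indices, one entry per additional
    Child SA already installed for this TSi/TSr pair (hypothetical
    bookkeeping kept by the IKE daemon).
    """
    load = Counter(installed_cpus)
    # Counter returns 0 for CPUs with no Child SA yet; ties are broken
    # by the lowest CPU index.
    return min(range(num_cpus), key=lambda cpu: (load[cpu], cpu))
```

For example, with Child SAs already on CPUs 0, 0, and 1 of a 4-CPU host, the sketch selects CPU 2.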
When the number of queue or CPU resources are different between the When the number of queue or CPU resources are different between the
peers, the peer with the least amount of resources may decide to not peers, the peer with the least amount of resources may decide to not
install a second outbound Child SA for the same resource as it will install a second outbound Child SA for the same resource, as it will
never use it to send traffic. However, it must install all inbound never use it to send traffic. However, it must install all inbound
Child SAs as it has committed to receiving traffic on these Child SAs because it has committed to receiving traffic on these
negotiated Child SAs. negotiated Child SAs.
If per-CPU packet trigger (e.g. SADB_ACQUIRE) messages are If per-CPU packet trigger (e.g., SADB_ACQUIRE) messages are
implemented (see Section 6), the Traffic Selector (TSi) entry implemented (see Section 6), the Traffic Selector (TSi) entry
containing the information of the trigger packet should be included containing the information of the trigger packet should be included
in the TS set similarly to regular Child SAs as specified in IKEv2 in the TS set similarly to regular Child SAs as specified in IKEv2
[RFC7296], Section 2.9. Based on the trigger TSi entry, an
[RFC7296] Section 2.9. Based on the trigger TSi entry, an
implementation can select the most optimal target CPU to install the implementation can select the most optimal target CPU to install the
additional Child SA on. For example, if the trigger packet was for a additional Child SA on. For example, if the trigger packet was for a
TCP destination to port 25 (SMTP), it might be able to install the TCP destination to port 25 (SMTP), it might be able to install the
Child SA on the CPU that is also running the mail server process. Child SA on the CPU that is also running the mail server process.
Trigger packet Traffic Selectors are documented in IKEv2 [RFC7296] Trigger packet Traffic Selectors are documented in IKEv2 [RFC7296],
Section 2.9. Section 2.9.
As per IKEv2, rekeying a Child SA SHOULD use the same (or wider) As per IKEv2, rekeying a Child SA SHOULD use the same (or wider)
Traffic Selectors to ensure that the new Child SA covers everything Traffic Selectors to ensure that the new Child SA covers everything
that the rekeyed Child SA covers. This includes Traffic Selectors that the rekeyed Child SA covers. This includes Traffic Selectors
negotiated via Configuration Payloads (CP) such as negotiated via Configuration Payloads such as INTERNAL_IP4_ADDRESS,
INTERNAL_IP4_ADDRESS which may use the original wide TS set or use which may use the original wide TS set or use the narrowed TS set.
the narrowed TS set.
5. Payload Format 5. Payload Format
The Notify Payload format is defined in IKEv2 [RFC7296] section 3.10, The Notify Payload format is defined in IKEv2 [RFC7296],
and is copied here for convenience. Section 3.10, and is copied here for convenience.
All multi-octet fields representing integers are laid out in big All multi-octet fields representing integers are laid out in big
endian order (also known as "most significant byte first", or endian order (also known as "most significant byte first", or
"network byte order"). "network byte order").
5.1. SA_RESOURCE_INFO Notify Message Status Type payload 5.1. SA_RESOURCE_INFO Notify Message Status Type Payload
1 2 3 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-----------------------------+-------------------------------+ +-------------------------------+-------------------------------+
! Next Payload !C! RESERVED ! Payload Length ! ! Next Payload !C! RESERVED ! Payload Length !
+---------------+---------------+-------------------------------+ +---------------+---------------+-------------------------------+
! Protocol ID ! SPI Size ! Notify Message Type ! ! Protocol ID ! SPI Size ! Notify Message Type !
+---------------+---------------+-------------------------------+ +---------------+---------------+-------------------------------+
! ! ! !
~ Resource Identifier (optional) ~ ~ Resource Identifier (optional) ~
! ! ! !
+-------------------------------+-------------------------------+ +-------------------------------+-------------------------------+
* Protocol ID (1 octet) - MUST be 0. MUST be ignored if not 0. Protocol ID (1 octet) - MUST be 0. MUST be ignored if not 0.
* SPI Size (1 octet) - MUST be 0. MUST be ignored if not 0. SPI Size (1 octet) - MUST be 0. MUST be ignored if not 0.
* Notify Status Message Type value (2 octets) - set to [TBD1]. Notify Status Message Type value (2 octets) - set to 16444.
* Resource Identifier (optional). This opaque data may be set to Resource Identifier (optional) - This opaque data may be set to
convey the local identity of the resource. convey the local identity of the resource.
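As an illustration of the layout above, the following sketch serializes a Notify payload in network byte order, using the value 16444 given in the RFC column. The function name build_notify() and its parameters are hypothetical, and the sketch covers only the payload itself, not the enclosing IKEv2 message.

```python
import struct

SA_RESOURCE_INFO = 16444  # Notify Message Status Type value from this section


def build_notify(notify_type, data=b"", next_payload=0, critical=False):
    """Serialize a Notify payload with Protocol ID 0 and SPI Size 0."""
    # 4-octet generic payload header + 4-octet Notify header + data
    length = 8 + len(data)
    # The C bit is the most significant bit; RESERVED bits stay 0.
    flags = 0x80 if critical else 0x00
    header = struct.pack("!BBHBBH", next_payload, flags, length,
                         0, 0, notify_type)
    return header + data
```

The Resource Identifier is passed through as opaque bytes; an empty value yields the 8-octet minimum payload.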
5.2. TS_MAX_QUEUE Notify Message Error Type Payload 5.2. TS_MAX_QUEUE Notify Message Error Type Payload
1 2 3 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+---------------+---------------+-------------------------------+ +---------------+---------------+-------------------------------+
! Next Payload !C! RESERVED ! Payload Length ! ! Next Payload !C! RESERVED ! Payload Length !
+---------------+---------------+-------------------------------+ +---------------+---------------+-------------------------------+
! Protocol ID ! SPI Size ! Notify Message Type ! ! Protocol ID ! SPI Size ! Notify Message Type !
+---------------+---------------+-------------------------------+ +---------------+---------------+-------------------------------+
* Protocol ID (1 octet) - MUST be 0. MUST be ignored if not 0. Protocol ID (1 octet) - MUST be 0. MUST be ignored if not 0.
* SPI Size (1 octet) - MUST be 0. MUST be ignored if not 0. SPI Size (1 octet) - MUST be 0. MUST be ignored if not 0.
* Notify Message Error Type (2 octets) - set to [TBD2] Notify Message Error Type (2 octets) - set to 48.
* There is no data associated with this Notify type. There is no data associated with this Notify type.
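A receiver could recognize this notification with a parser such as the sketch below, using the value 48 given in the RFC column. The names parse_notify() and is_ts_max_queue() are hypothetical. The sketch also assumes that "MUST be ignored if not 0" means a non-zero Protocol ID or SPI Size is neither rejected nor used to skip SPI octets; that reading is an interpretation, not something this document spells out.

```python
import struct

TS_MAX_QUEUE = 48  # Notify Message Error Type value from this section


def parse_notify(payload):
    """Parse one Notify payload and return (notify_type, data)."""
    if len(payload) < 8:
        raise ValueError("truncated Notify payload")
    _next, _flags, length, _proto, _spi_size, notify_type = struct.unpack(
        "!BBHBBH", payload[:8])
    if length < 8 or length > len(payload):
        raise ValueError("bad Payload Length")
    # Assumption: Protocol ID and SPI Size are ignored even if non-zero.
    return notify_type, payload[8:length]


def is_ts_max_queue(payload):
    """True if the payload carries the TS_MAX_QUEUE error notify."""
    notify_type, _data = parse_notify(payload)
    return notify_type == TS_MAX_QUEUE
```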
6. Operational Considerations 6. Operational Considerations
Implementations supporting per-CPU SAs SHOULD extend their local SPD Implementations supporting per-CPU SAs SHOULD extend their local SPD
selector, and the mechanism of on-demand negotiation that is selector, and the mechanism of on-demand negotiation that is
triggered by traffic to include a CPU (or queue) identifier in their triggered by traffic to include a CPU (or queue) identifier in their
packet trigger (e.g. SADB_ACQUIRE) message from the SPD to the IKE packet trigger (e.g., SADB_ACQUIRE) message from the SPD to the IKE
daemon. An implementation which does not support receiving per-CPU daemon. An implementation that does not support receiving per-CPU
packet trigger messages MAY initiate all its Child SAs immediately packet trigger messages MAY initiate all its Child SAs immediately
upon receiving the (only) packet trigger message it will receive from upon receiving the (only) packet trigger message it will receive from
the IPsec stack. Such implementations also need to be careful when the IPsec stack. Such an implementation also needs to be careful
receiving a Delete Notify request for a per-CPU Child SA, as it has when receiving a Delete Notify request for a per-CPU Child SA, as it
no method to detect when it should bring up such a per-CPU Child SA has no method to detect when it should bring up such a per-CPU Child
again later. And bringing the deleted per-CPU Child SA up again SA again later. Also, bringing the deleted per-CPU Child SA up again
immediately after receiving the Delete Notify might cause an infinite immediately after receiving the Delete Notify might cause an infinite
loop between the peers. Another issue of not bringing up all its loop between the peers. Another issue with not bringing up all its
per-CPU Child SAs is that if the peer acts similarly, the two peers per-CPU Child SAs is that if the peer acts similarly, the two peers
might end up with only the first Child SA without ever activating any might end up with only the first Child SA without ever activating any
per-CPU Child SAs. It is therefor RECOMMENDED to implement per-CPU per-CPU Child SAs. It is therefore RECOMMENDED to implement per-CPU
packet trigger messages. packet trigger messages.
Peers SHOULD be flexible with the maximum number of Child SAs they Peers SHOULD be flexible with the maximum number of Child SAs they
allow for a given TSi/TSr combination to account for corner cases. allow for a given TSi/TSr combination in order to account for corner
For example, during Child SA rekeying, there might be a large number cases. For example, during Child SA rekeying, there might be a large
of additional Child SAs created before the old Child SAs are torn number of additional Child SAs created before the old Child SAs are
down. Similarly, when using on-demand Child SAs, both ends could torn down. Similarly, when using on-demand Child SAs, both ends
trigger multiple Child SA requests as the initial packet causing the could trigger multiple Child SA requests as the initial packet
Child SA negotiation might have been transported to the peer via the causing the Child SA negotiation might have been transported to the
first Child SA where its reply packet might also trigger an on-demand peer via the first Child SA, where its reply packet might also
Child SA negotiation to start. As additional Child SAs consume trigger an on-demand Child SA negotiation to start. As additional
little additional resources, allowing at the very least double the Child SAs consume little additional resources, allowing at the very
number of available CPUs is RECOMMENDED. An implementation MAY allow least double the number of available CPUs is RECOMMENDED. An
unlimited additional Child SAs and only limit this number based on implementation MAY allow unlimited additional Child SAs and only
its generic resource protection strategies that are used to require limit this number based on its generic resource protection strategies
COOKIES or refuse new IKE or Child SA negotiations. Although having that are used to require COOKIES or refuse new IKE or Child SA
a very large number (e.g. hundreds or thousands) of SAs may slow down negotiations. Although having a very large number (e.g., hundreds or
per-packet SAD lookup. thousands) of SAs may slow down per-packet SAD lookup.
Implementations might support dynamically moving a per-CPU Child SAs Implementations might support dynamically moving a per-CPU Child SA
from one CPU to another CPU. If this method is supported, from one CPU to another CPU. If this method is supported,
implementations must be careful to move both the inbound and outbound implementations must be careful to move both the inbound and outbound
SAs. If the IPsec endpoint is a gateway, it can move the inbound SA SAs. If the IPsec endpoint is a gateway, it can move the inbound SA
and outbound SA independently of each other. It is likely that for a and outbound SA independently of each other. It is likely that for a
gateway, IPsec traffic would be asymmetric. If the IPsec endpoint is gateway, IPsec traffic would be asymmetric. If the IPsec endpoint is
the same host responsible for generating the traffic, the inbound and the same host responsible for generating the traffic, the inbound and
outbound SAs SHOULD remain as a pair on the same CPU. If a host outbound SAs SHOULD remain as a pair on the same CPU. If a host
previously skipped installing an outbound SA because it would be an previously skipped installing an outbound SA because it would be an
unused duplicate outbound SA, it will have to create and add the unused duplicate outbound SA, it will have to create and add the
previously skipped outbound SA to the SAD with the new CPU ID. The previously skipped outbound SA to the SAD with the new CPU ID. The
inbound SA may not have CPU ID in the SAD. Adding the outbound SA to inbound SA may not have a CPU ID in the SAD. Adding the outbound SA
the SAD requires access to the key material, whereas for updating the to the SAD requires access to the key material, whereas updating the
CPU selector on an existing outbound SAs access to key material might CPU selector on an existing outbound SAs might not require access to
not be needed. To support this, the IKE software might have to hold key material. To support this, the IKE software might have to hold
on to the key material longer than it normally would, as it might on to the key material longer than it normally would, as it might
actively attempt to destroy key material from memory that the IKE actively attempt to destroy key material from memory that the IKE
daemon no longer needs access to. daemon no longer needs access to.
An implementation that does not accept any further resource specific An implementation that does not accept any further resource-specific
Child SAs MUST NOT return the NO_ADDITIONAL_SAS error because this Child SAs MUST NOT return the NO_ADDITIONAL_SAS error because this
can be interpreted by the peer that no other Child SAs with different can be interpreted by the peer that no other Child SAs with different
TSi/TSr are allowed either. Instead, it MUST return TS_MAX_QUEUE. TSi/TSr are allowed either. Instead, it MUST return TS_MAX_QUEUE.
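A minimal sketch of this rule in Python follows. The function name, its parameters, and the idea of tracking two separate limits are assumptions made for illustration and are not part of the specification. The numeric values are the registered code points: NO_ADDITIONAL_SAS is 35 in RFC 7296, and TS_MAX_QUEUE is 48 in the published RFC.

```python
from typing import Optional

NO_ADDITIONAL_SAS = 35  # RFC 7296
TS_MAX_QUEUE = 48       # value assigned in RFC 9611


def refusal_notify(sas_for_ts: int, max_per_ts: int,
                   total_sas: int, max_total: int) -> Optional[int]:
    """Return the error notify type to send, or None to accept.

    sas_for_ts: Child SAs already installed for this TSi/TSr
    total_sas:  Child SAs already installed on this IKE SA
    """
    if total_sas >= max_total:
        # No further Child SAs of any kind are accepted.
        return NO_ADDITIONAL_SAS
    if sas_for_ts >= max_per_ts:
        # Only further resource-specific SAs for this TSi/TSr are
        # refused; SAs with other Traffic Selectors remain possible.
        return TS_MAX_QUEUE
    return None
```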
7. Security Considerations 7. Security Considerations
Similar to how an implementation should limit the number of half-open Similar to how an implementation should limit the number of half-open
SAs to limit the impact of a denial of service attack, it is SAs to limit the impact of a denial-of-service attack, it is
RECOMMENDED that an implementation limits the maximum number of RECOMMENDED that an implementation limits the maximum number of
additional Child SAs allowed per unique TSi/TSr. additional Child SAs allowed per unique TSi/TSr.
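One way to enforce such a limit is a counter keyed by the TSi/TSr pair. The Python sketch below is illustrative only: the class name ChildSaLimiter, the representation of a Traffic Selector pair as a hashable key, and the chosen limit are all assumptions.

```python
from collections import Counter
from typing import Hashable


class ChildSaLimiter:
    """Caps the number of additional Child SAs per unique TSi/TSr."""

    def __init__(self, max_additional: int) -> None:
        self.max_additional = max_additional
        self._additional: Counter = Counter()

    def try_add(self, ts_pair: Hashable) -> bool:
        """Record one more additional Child SA; False if over the limit."""
        if self._additional[ts_pair] >= self.max_additional:
            return False
        self._additional[ts_pair] += 1
        return True

    def remove(self, ts_pair: Hashable) -> None:
        """Record that an additional Child SA was deleted."""
        if self._additional[ts_pair] > 0:
            self._additional[ts_pair] -= 1
```

A request for which try_add() returns False would be answered with TS_MAX_QUEUE.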
Using multiple resource specific child SAs makes sense for high Using multiple resource-specific child SAs makes sense for high-
volume IPsec connections on IPsec gateway machines where the volume IPsec connections on IPsec gateway machines where the
administrator has a trust relationship with the peer's administrator administrator has a trust relationship with the peer's administrator
and abuse is unlikely and easily escalated to resolve. and abuse is unlikely and easily escalated to resolve.
This trust relationship is usually not present for the Remote Access This trust relationship is usually not present for the deployments of
VPN type deployments, and allowing per-CPU Child SA's is NOT remote access VPNs, and allowing per-CPU Child SAs is NOT RECOMMENDED
RECOMMENDED in these scenarios. Therefore, it is also NOT in these scenarios. Therefore, it is also NOT RECOMMENDED to allow
RECOMMENDED to allow per-CPU Child SAs per default. per-CPU Child SAs by default.
The SA_RESOURCE_INFO notify contains an optional data payload that The SA_RESOURCE_INFO notify contains an optional data payload that
can be used by the peer to identify the Child SA belonging to a can be used by the peer to identify the Child SA belonging to a
specific resource. The notify data SHOULD NOT be an identifier that specific resource. The notify data SHOULD NOT be an identifier that
can be used to gain information about the hardware. For example, can be used to gain information about the hardware. For example,
using the CPU number itself as identifier might give an attacker using the CPU number itself as the identifier might give an attacker
knowledge which packets are handled by which CPU ID and it might knowledge of which packets are handled by which CPU ID, and it might
optimize a brute force attack against the system. optimize a brute-force attack against the system.
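As an illustration, an implementation could hand out random opaque identifiers and keep the mapping to CPU numbers local. The Python sketch below assumes an 8-octet identifier and a class named ResourceIdMap; both are hypothetical choices, not requirements of this document.

```python
import secrets
from typing import Dict, Optional


class ResourceIdMap:
    """Maps local CPU numbers to random opaque notify data."""

    def __init__(self, id_len: int = 8) -> None:
        self._id_len = id_len
        self._by_cpu: Dict[int, bytes] = {}
        self._by_id: Dict[bytes, int] = {}

    def notify_data(self, cpu: int) -> bytes:
        """Return the stable opaque identifier for a CPU."""
        ident = self._by_cpu.get(cpu)
        if ident is None:
            ident = secrets.token_bytes(self._id_len)
            while ident in self._by_id:  # avoid an unlikely collision
                ident = secrets.token_bytes(self._id_len)
            self._by_cpu[cpu] = ident
            self._by_id[ident] = cpu
        return ident

    def cpu_for(self, ident: bytes) -> Optional[int]:
        """Local reverse lookup; the peer never learns the CPU number."""
        return self._by_id.get(ident)
```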
8. Implementation Status
[Note to RFC Editor: Please remove this section and the reference to
[RFC7942] before publication.]
This section records the status of known implementations of the
protocol defined by this specification at the time of posting of this
Internet-Draft, and is based on a proposal described in [RFC7942].
The description of implementations in this section is intended to
assist the IETF in its decision processes in progressing drafts to
RFCs. Please note that the listing of any individual implementation
here does not imply endorsement by the IETF. Furthermore, no effort
has been spent to verify the information presented here that was
supplied by IETF contributors. This is not intended as, and must not
be construed to be, a catalog of available implementations or their
features. Readers are advised to note that other implementations may
exist.
According to [RFC7942], "this will allow reviewers and working groups
to assign due consideration to documents that have the benefit of
running code, which may serve as evidence of valuable experimentation
and feedback that have made the implemented protocols more mature.
It is up to the individual working groups to use this information as
they see fit".
Authors are requested to add a note to the RFC Editor at the top of
this section, advising the Editor to remove the entire section before
publication, as well as the reference to [RFC7942].
8.1. Linux XFRM
Organization: Linux kernel XFRM
Name: XFRM-PCPU-v7
https://git.kernel.org/pub/scm/linux/kernel/git/klassert/linux-
stk.git/log/?h=xfrm-pcpu-v7
Description: An initial Kernel IPsec implementation of the per-CPU
method.
Level of maturity: Alpha
Coverage: Implements a general Child SA and per-CPU Child SAs. It
only supports the NETLINK API. The PFKEYv2 API is not supported.
Licensing: GPLv2
Implementation experience: The Linux XFRM implementation added two
additional attributes to support per-CPU SAs. There is a new
attribute XFRMA_SA_PCPU, u32, for the SAD entry. This attribute
should present on the outgoing SA, per-CPU Child SAs, starting
from 0. This attribute MUST NOT be present on the first XFRM SA.
It is used by the kernel only for the outgoing traffic, (clear to
encrypted). The incoming SAs do not need XFRMA_SA_PCPU attribute.
XFRM stack can not use CPU id on the incoming SA. The kernel
internally sets the value to 0xFFFFFF for the incoming SA and the
initial Child SA that can be used by any CPU. However, one may
add XFRMA_SA_PCPU to the incoming per-CPU SA to steer the ESP
flow, to a specific Q or CPU e.g ethtool ntuple configuration.
The SPD entry has new flag XFRM_POLICY_CPU_ACQUIRE. It should be
set only on the "out" policy. The flag should be disabled when
the policy is a trap policy, without SPD entries. After a
successful negotiation of SA_RESOURCE_INFO, while adding the first
Child SA, the SPD entry can be updated with the
XFRM_POLICY_CPU_ACQUIRE flag. When XFRM_POLICY_CPU_ACQUIRE is
set, the XFRM_MSG_ACQUIRE generated will include the XFRMA_SA_PCPU
attribute.
Contact: Steffen Klassert steffen.klassert@secunet.com
8.2. Libreswan
Organization: The Libreswan Project
Name: pcpu-3 https://libreswan.org/wiki/XFRM_pCPU
Description: An initial IKE implementation of the per-CPU method.
Level of maturity: Alpha
Coverage: implements combining a regular (all-CPUs) Child SA and
per-CPU additional Child SAs
Licensing: GPLv2
Implementation experience: TBD
Contact: Libreswan Development: swan-dev@libreswan.org
8.3. strongSwan
Organization: The StrongSwan Project
Name: StrongSwan https://github.com/strongswan/strongswan/tree/per-
cpu-sas-poc/
Description: An initial IKE implementation of the per-CPU method.
Level of maturity: Alpha
Coverage: implements combining a regular (all-CPUs) Child SA and
per-CPU additional Child SAs
Licensing: GPLv2
Implementation experience: StrongSwan use private space values for
notifications SA_RESOURCE_INFO (40970).
Contact: Tobias Brunner tobias@strongswan.org
8.4. iproute2
Organization: The iproute2 Project
Name: iproute2 https://github.com/antonyantony/iproute2/tree/pcpu-v1
Description: Implemented the per-CPU attributes for the "ip xfrm"
command.
Level of maturity: Alpha
Licensing: GPLv2
Implementation experience: TBD
Contact: Antony Antony antony.antony@secunet.com
rfc9611.original:

9. IANA Considerations
This document defines one new registration for the IANA "IKEv2 Notify
Message Status Types" registry.
Value   Notify Message Status Type      Reference
-----   ------------------------------  ---------------
[TBD1]  SA_RESOURCE_INFO                [this document]
Figure 1
This document defines one new registration for the IANA "IKEv2 Notify
Message Error Types" registry.
Value   Notify Message Error Type       Reference
-----   ------------------------------  ---------------
[TBD2]  TS_MAX_QUEUE                    [this document]
Figure 2
10. Acknowledgements
The following people provided reviews and valuable feedback: Roman
Danyliw, Warren Kumari Tero Kivinen, Murray Kucherawy, John Scudder,
Valery Smyslov, Gunter van de Velde and Eric Vyncke.

rfc9611.txt:

8. IANA Considerations
IANA has registered one new value in the "IKEv2 Notify Message Status
Types" registry.
+=======+============================+===========+
| Value | Notify Message Status Type | Reference |
+=======+============================+===========+
| 16444 | SA_RESOURCE_INFO           | RFC 9611  |
+-------+----------------------------+-----------+
Table 1
IANA has registered one new value in the "IKEv2 Notify Message Error
Types" registry.
+=======+===========================+===========+
| Value | Notify Message Error Type | Reference |
+=======+===========================+===========+
| 48    | TS_MAX_QUEUE              | RFC 9611  |
+-------+---------------------------+-----------+
Table 2
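For implementers, the two code points registered in the published RFC can be captured as constants. The Python sketch below also shows the range rule from RFC 7296, under which notify types 0 through 16383 report errors and types 16384 and above are status types. The helper name is_error_notify is hypothetical.

```python
SA_RESOURCE_INFO = 16444  # "IKEv2 Notify Message Status Types"
TS_MAX_QUEUE = 48         # "IKEv2 Notify Message Error Types"


def is_error_notify(notify_type: int) -> bool:
    """True if the type falls in the RFC 7296 error range (0-16383)."""
    return 0 <= notify_type <= 16383
```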
11. References 9. References
11.1. Normative References 9.1. Normative References
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", BCP 14, RFC 2119, Requirement Levels", BCP 14, RFC 2119,
DOI 10.17487/RFC2119, March 1997, DOI 10.17487/RFC2119, March 1997,
<https://www.rfc-editor.org/info/rfc2119>. <https://www.rfc-editor.org/info/rfc2119>.
[RFC7296] Kaufman, C., Hoffman, P., Nir, Y., Eronen, P., and T. [RFC7296] Kaufman, C., Hoffman, P., Nir, Y., Eronen, P., and T.
Kivinen, "Internet Key Exchange Protocol Version 2 Kivinen, "Internet Key Exchange Protocol Version 2
(IKEv2)", STD 79, RFC 7296, DOI 10.17487/RFC7296, October (IKEv2)", STD 79, RFC 7296, DOI 10.17487/RFC7296, October
2014, <https://www.rfc-editor.org/info/rfc7296>. 2014, <https://www.rfc-editor.org/info/rfc7296>.
[RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC [RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
May 2017, <https://www.rfc-editor.org/info/rfc8174>. May 2017, <https://www.rfc-editor.org/info/rfc8174>.
11.2. Informative References 9.2. Informative References
[RFC2367] McDonald, D., Metz, C., and B. Phan, "PF_KEY Key [RFC2367] McDonald, D., Metz, C., and B. Phan, "PF_KEY Key
Management API, Version 2", RFC 2367, Management API, Version 2", RFC 2367,
DOI 10.17487/RFC2367, July 1998, DOI 10.17487/RFC2367, July 1998,
<https://www.rfc-editor.org/info/rfc2367>. <https://www.rfc-editor.org/info/rfc2367>.
[RFC4301] Kent, S. and K. Seo, "Security Architecture for the [RFC4301] Kent, S. and K. Seo, "Security Architecture for the
Internet Protocol", RFC 4301, DOI 10.17487/RFC4301, Internet Protocol", RFC 4301, DOI 10.17487/RFC4301,
December 2005, <https://www.rfc-editor.org/info/rfc4301>. December 2005, <https://www.rfc-editor.org/info/rfc4301>.
rfc9611.original:

[RFC7942] Sheffer, Y. and A. Farrel, "Improving Awareness of Running
Code: The Implementation Status Section", BCP 205,
RFC 7942, DOI 10.17487/RFC7942, July 2016,
<https://www.rfc-editor.org/info/rfc7942>.

rfc9611.txt:

Acknowledgements
The following people provided reviews and valuable feedback: Roman
Danyliw, Warren Kumari, Tero Kivinen, Murray Kucherawy, John Scudder,
Valery Smyslov, Gunter van de Velde, and Éric Vyncke.
Authors' Addresses Authors' Addresses
Antony Antony Antony Antony
secunet Security Networks AG secunet Security Networks AG
Email: antony.antony@secunet.com Email: antony.antony@secunet.com
Tobias Brunner Tobias Brunner
codelabs GmbH codelabs GmbH
Email: tobias@codelabs.ch Email: tobias@codelabs.ch
 End of changes. 65 change blocks. 
289 lines changed or deleted 152 lines changed or added
