fi_usnic(7) Libfabric Programmer's Manual
NAME
The usNIC Fabric Provider
OVERVIEW
The usnic provider is designed to run over the Cisco VIC (virtualized NIC) hardware on Cisco UCS servers. It utilizes the Cisco usNIC (userspace NIC) capabilities of the VIC to enable ultra low latency and other offload capabilities on Ethernet networks.
RELEASE NOTES
- The usnic libfabric provider requires the use of the “libnl”
library.
- There are two versions of libnl generally available: v1 and v3; the usnic provider can use either version.
- If you are building libfabric/the usnic provider from source, you will need to have the libnl header files available (e.g., if you are installing libnl from RPM or other packaging system, install the “-devel” versions of the package).
- If you have libnl (either v1 or v3) installed in a non-standard
location (e.g., not in /usr/lib or /usr/lib64), you may need to
tell libfabric’s configure where to find libnl via the
--with-libnl=DIR
command line option (where DIR is the installation prefix of the libnl package).
- The most common way to use the libfabric usnic provider is via an MPI implementation that uses libfabric (and the usnic provider) as a lower layer transport. MPI applications do not need to know anything about libfabric or usnic in this use case – the MPI implementation hides all these details from the application.
- If you are writing applications directly to the libfabric API:
- FI_EP_DGRAM endpoints are the best supported method of utilizing the usNIC interface. Specifically, the FI_EP_DGRAM endpoint type has been extensively tested as the underlying layer for Open MPI’s usnic BTL.
- FI_EP_MSG and FI_EP_RDM endpoints are implemented, but are only lightly tested. It is likely that there are still some bugs in these endpoint types. In particular, there are known bugs in RDM support in the presence of congestion or packet loss (issue 1621). RMA is not yet supported.
fi_provider
(7) lists requirements for all providers. The following limitations exist in the usnic provider:- multicast operations are not supported on FI_EP_DGRAM and FI_EP_RDM endpoints.
- FI_EP_MSG endpoints only support connect, accept, and getname CM operations.
- Passive endpoints only support listen, setname, and getname CM operations.
- FI_EP_DGRAM endpoints support
fi_sendmsg()
andfi_recvmsg()
, but some flags are ignored.fi_sendmsg()
supportsFI_INJECT
andFI_COMPLETION
.fi_recvmsg()
supportsFI_MORE
. - Address vectors only support
FI_AV_MAP
. - No counters are supported.
- The tag matching interface is not supported.
- FI_MSG_PREFIX is only supported on FI_EP_DGRAM and usage is limited to releases 1.1 and beyond.
- fi_control with FI_GETWAIT may only be used on CQs that have been bound to an endpoint. If fi_control is used on an unbound CQ, it will return -FI_EOPBADSTATE.
- Resource management is not supported. The application is responsible for resource protection.
- The usnic libfabric provider supports extensions that provide information and functionality beyond the standard libfabric interface. See the “USNIC EXTENSIONS” section, below.
USNIC EXTENSIONS
The usnic libfabric provider exports extensions for additional VIC, usNIC, and Ethernet capabilities not provided by the standard libfabric interface.
These extensions are available via the “fi_ext_usnic.h” header file.
Fabric Extension: getinfo
Version 2 of the “fabric getinfo” extension was introduced in Libfabric release
v1.3.0 and can be used to retrieve IP and SR-IOV information about a usNIC
device obtained from the fi_getinfo
(3) function.
The “fabric getinfo” extension is obtained by calling fi_open_ops
and
requesting FI_USNIC_FABRIC_OPS_1
to get the usNIC fabric extension
operations. The getinfo
function accepts a version parameter that can be
used to select different versions of the extension. The information returned by
the “fabric getinfo” extension is accessible through a fi_usnic_info
struct
that uses a version tagged union. The accessed union member must correspond
with the requested version. It is recommended that applications explicitly
request a version rather than using the header provided
FI_EXT_USNIC_INFO_VERSION
. Although there is a version 1 of the extension,
its use is discouraged, and it may not be available in future releases.
Compatibility issues
The addition of version 2 of the extension caused an alignmment issue that could lead to invalid data in the v1 portion of the structure. This means that the alignment difference manifests when an application using v1 of the extension is compiled with Libfabric v1.1.x or v1.2.x, but then runs with Libfabric.so that is v1.3.x or higher (and vice versa).
The v1.4.0 release of Libfabric introduced a padding field to explicitly maintain compatibility with the v1.3.0 release. If the issue is encountered, then it is recommended that you upgrade to a release containing version 2 of the extension, or recompile with a patched version of an older release.
#include <rdma/fi_ext_usnic.h>
struct fi_usnic_info {
uint32_t ui_version;
uint8_t ui_pad0[4];
union {
struct fi_usnic_info_v1 v1;
struct fi_usnic_info_v2 v2;
} ui;
} __attribute__((packed));
int getinfo(uint32_t version, struct fid_fabric *fabric,
struct fi_usnic_info *info);
- version
- Version of getinfo to be used
- fabric
- Fabric descriptor
- info
- Upon successful return, this parameter will contain information about the fabric.
- Version 2
struct fi_usnic_cap {
const char *uc_capability;
int uc_present;
} __attribute__((packed));
struct fi_usnic_info_v2 {
uint32_t ui_link_speed;
uint32_t ui_netmask_be;
char ui_ifname[IFNAMSIZ];
unsigned ui_num_vf;
unsigned ui_qp_per_vf;
unsigned ui_cq_per_vf;
char ui_devname[FI_EXT_USNIC_MAX_DEVNAME];
uint8_t ui_mac_addr[6];
uint8_t ui_pad0[2];
uint32_t ui_ipaddr_be;
uint32_t ui_prefixlen;
uint32_t ui_mtu;
uint8_t ui_link_up;
uint8_t ui_pad1[3];
uint32_t ui_vendor_id;
uint32_t ui_vendor_part_id;
uint32_t ui_device_id;
char ui_firmware[64];
unsigned ui_intr_per_vf;
unsigned ui_max_cq;
unsigned ui_max_qp;
unsigned ui_max_cqe;
unsigned ui_max_send_credits;
unsigned ui_max_recv_credits;
const char *ui_nicname;
const char *ui_pid;
struct fi_usnic_cap **ui_caps;
} __attribute__((packed));
- Version 1
struct fi_usnic_info_v1 {
uint32_t ui_link_speed;
uint32_t ui_netmask_be;
char ui_ifname[IFNAMSIZ];
uint32_t ui_num_vf;
uint32_t ui_qp_per_vf;
uint32_t ui_cq_per_vf;
} __attribute__((packed));
Version 1 of the “fabric getinfo” extension can be used by explicitly
requesting it in the call to getinfo
and accessing the v1
portion of the
fi_usnic_info.ui
union. Use of version 1 is not recommended and it may be
removed from future releases.
The following is an example of how to utilize version 2 of the usnic “fabric getinfo” extension.
#include <stdio.h>
#include <rdma/fabric.h>
/* The usNIC extensions are all in the
rdma/fi_ext_usnic.h header */
#include <rdma/fi_ext_usnic.h>
int main(int argc, char *argv[]) {
struct fi_info *info;
struct fi_info *info_list;
struct fi_info hints = {0};
struct fi_ep_attr ep_attr = {0};
struct fi_fabric_attr fabric_attr = {0};
fabric_attr.prov_name = "usnic";
ep_attr.type = FI_EP_DGRAM;
hints.caps = FI_MSG;
hints.mode = FI_LOCAL_MR | FI_MSG_PREFIX;
hints.addr_format = FI_SOCKADDR;
hints.ep_attr = &ep_attr;
hints.fabric_attr = &fabric_attr;
/* Find all usnic providers */
fi_getinfo(FI_VERSION(1, 0), NULL, 0, 0, &hints, &info_list);
for (info = info_list; NULL != info; info = info->next) {
/* Open the fabric on the interface */
struct fid_fabric *fabric;
fi_fabric(info->fabric_attr, &fabric, NULL);
/* Pass FI_USNIC_FABRIC_OPS_1 to get usnic ops
on the fabric */
struct fi_usnic_ops_fabric *usnic_fabric_ops;
fi_open_ops(&fabric->fid, FI_USNIC_FABRIC_OPS_1, 0,
(void **) &usnic_fabric_ops, NULL);
/* Now use the returned usnic ops structure to call
usnic extensions. The following extension queries
some IP and SR-IOV characteristics about the
usNIC device. */
struct fi_usnic_info usnic_info;
/* Explicitly request version 2. */
usnic_fabric_ops->getinfo(2, fabric, &usnic_info);
printf("Fabric interface %s is %s:\n"
"\tNetmask: 0x%08x\n\tLink speed: %d\n"
"\tSR-IOV VFs: %d\n\tQPs per SR-IOV VF: %d\n"
"\tCQs per SR-IOV VF: %d\n",
info->fabric_attr->name,
usnic_info.ui.v2.ui_ifname,
usnic_info.ui.v2.ui_netmask_be,
usnic_info.ui.v2.ui_link_speed,
usnic_info.ui.v2.ui_num_vf,
usnic_info.ui.v2.ui_qp_per_vf,
usnic_info.ui.v2.ui_cq_per_vf);
fi_close(&fabric->fid);
}
fi_freeinfo(info_list);
return 0;
}
Adress Vector Extension: get_distance
The “address vector get_distance” extension was introduced in Libfabric release v1.0.0 and can be used to retrieve the network distance of an address.
The “get_distance” extension is obtained by calling fi_open_ops
and
requesting FI_USNIC_AV_OPS_1
to get the usNIC address vector extension
operations.
int get_distance(struct fid_av *av, void *addr, int *metric);
- av
- Address vector
- addr
- Destination address
- metric
- On output this will contain
-1
if the destination host is unreachable,0
is the destination host is locally connected, and1
otherwise.
See fi_ext_usnic.h for more details.
VERSION DIFFERENCES
New naming convention for fabric/domain starting with libfabric v1.4
The release of libfabric v1.4 introduced a new naming convention for fabric and domain. However the usNIC provider
remains backward compatible with applications supporting the old scheme and decides which one to use based on
the version passed to fi_getinfo
:
- When
FI_VERSION(1,4)
or higher is used:- fabric name is the network address with the CIDR notation (i.e.,
a.b.c.d/e
) - domain name is the usNIC Linux interface name (i.e.,
usnic_X
)
- fabric name is the network address with the CIDR notation (i.e.,
- When a lower version number is used, like
FI_VERSION(1, 3)
, it follows the same behavior the usNIC provider exhibited in libfabric <= v1.3:- fabric name is the usNIC Linux interface name (i.e.,
usnic_X
) - domain name is
NULL
- fabric name is the usNIC Linux interface name (i.e.,