Commit 345c07a

usnic: require libfabric >= v1.3 at run time
There are critical usnic libfabric AV insert bugs before v1.3, so don't allow any version prior to v1.3 at run time (still allow *compiling* with earlier versions, though, since the ABI guarantees allow us to compile with an earlier libfabric and run with a later libfabric).

Switch to using fi_version() to check the version (instead of calling fi_getinfo()) as a potentially lighter-weight / simpler solution. This allows us to only call fi_getinfo() once.

Signed-off-by: Jeff Squyres <[email protected]>
1 parent b138138 commit 345c07a

1 file changed, +76 -53 lines changed

opal/mca/btl/usnic/btl_usnic_component.c

Lines changed: 76 additions & 53 deletions
@@ -590,25 +590,6 @@ static void free_filter(usnic_if_filter_t *filter)
     free(filter);
 }
 
-static int do_fi_getinfo(uint32_t version, struct fi_info **info_list)
-{
-    struct fi_info hints = {0};
-    struct fi_ep_attr ep_attr = {0};
-    struct fi_fabric_attr fabric_attr = {0};
-
-    /* We only want providers named "usnic" that are of type EP_DGRAM */
-    fabric_attr.prov_name = "usnic";
-    ep_attr.type = FI_EP_DGRAM;
-
-    hints.caps = FI_MSG;
-    hints.mode = FI_LOCAL_MR | FI_MSG_PREFIX;
-    hints.addr_format = FI_SOCKADDR;
-    hints.ep_attr = &ep_attr;
-    hints.fabric_attr = &fabric_attr;
-
-    return fi_getinfo(version, NULL, 0, 0, &hints, info_list);
-}
-
 /*
  * UD component initialization:
  * (1) read interface list from kernel and compare against component
@@ -652,40 +633,61 @@ static mca_btl_base_module_t** usnic_component_init(int* num_btl_modules,
 
     OBJ_CONSTRUCT(&btl_usnic_lock, opal_recursive_mutex_t);
 
-    /* This code understands libfabric API versions v1.0, v1.1, and
-       v1.4. Even if we were compiled with libfabric API v1.0, we
-       still want to request v1.1 -- here's why:
-
-       - In libfabric v1.0.0 (i.e., API v1.0), the usnic provider did
-         not check the value of the "version" parameter passed into
-         fi_getinfo()
-
-       - If you pass FI_VERSION(1,0) to libfabric v1.1.0 (i.e., API
-         v1.1), the usnic provider will disable FI_MSG_PREFIX support
-         (on the assumption that the application will not handle
-         FI_MSG_PREFIX properly). This can happen if you compile OMPI
-         against libfabric v1.0.0 (i.e., API v1.0) and run OMPI
-         against libfabric v1.1.0 (i.e., API v1.1).
-
-       So never request API v1.0 -- always request a minimum of
-       v1.1.
-
-       The usnic provider changed the strings in the fabric and domain
-       names in API v1.4. With API <= v1.3:
+    /* There are multiple dimensions to consider when requesting an
+       API version number from libfabric:
+
+       1. This code understands libfabric API versions v1.0 through
+          v1.4.
+
+       2. Open MPI may be *compiled* against one version of libfabric,
+          but may be *running* with another.
+
+       3. There were usnic-specific bugs in Libfabric prior to
+          libfabric v1.3.0 (where "v1.3.0" is the tarball/package
+          version, not the API version; but happily, the API version
+          was also 1.3 in Libfabric v1.3.0):
+
+          - In libfabric v1.0.0 (i.e., API v1.0), the usnic provider
+            did not check the value of the "version" parameter passed
+            into fi_getinfo()
+          - If you pass FI_VERSION(1,0) to libfabric v1.1.0 (i.e., API
+            v1.1), the usnic provider will disable FI_MSG_PREFIX
+            support (on the assumption that the application will not
+            handle FI_MSG_PREFIX properly). This can happen if you
+            compile OMPI against libfabric v1.0.0 (i.e., API v1.0) and
+            run OMPI against libfabric v1.1.0 (i.e., API v1.1).
+          - Some critical AV bug fixes were included in libfabric
+            v1.3.0; prior versions can fail in fi_av_* operations in
+            unexpected ways (libnl: you win again!).
+
+       So always request a minimum API version of v1.3.
+
+       Note that the FI_MAJOR_VERSION and FI_MINOR_VERSION in
+       <rdma/fabric.h> represent the API version, not the Libfabric
+       package (i.e., tarball) version. As of Libfabric v1.3, there
+       is currently no way to know a) what package version of
+       Libfabric you were compiled against, and b) what package
+       version of Libfabric you are running with.
+
+       Also note that the usnic provider changed the strings in the
+       fabric and domain names in API v1.4. With API <= v1.3:
 
        - fabric name is "usnic_X" (device name)
        - domain name is NULL
 
-       With libfabric API >= v1.4:
+       With libfabric API >= v1.4, all Libfabric IP-based providers
+       (including usnic) follow the same convention:
 
        - fabric name is "a.b.c.d/e" (CIDR notation of network)
        - domain name is "usnic_X" (device name)
 
        NOTE: The configure.m4 in this component will require libfabric
-       >= v1.1.0 (i.e., it won't accept v1.0.0) because of a critical
-       bug in the usnic provider in libfabric v1.0.0. However, the
-       compatibility code with libfabric v1.0.0 in the usNIC BTL has
-       been retained, for two reasons:
+       >= v1.1.0 (i.e., it won't accept v1.0.0) because it needs
+       access to the usNIC extension header structures that only
+       became available in v1.1.0.
+
+       All that being said, the compatibility code with libfabric
+       v1.0.0 in the usNIC BTL has been retained, for two reasons:
 
        1. It's not harmful, nor overly complicated. So the
           compatibility code was not ripped out.
@@ -695,19 +697,40 @@ static mca_btl_base_module_t** usnic_component_init(int* num_btl_modules,
        Someday, #2 may no longer be true, and we may therefore rip out
        the libfabric v1.0.0 compatibility code. */
 
-    /* First try API version 1.4. If that doesn't work, try API
-       version 1.1. */
+    /* First, check to see if the libfabric we are running with is <=
+       libfabric v1.3. If so, don't bother going further. */
     uint32_t libfabric_api;
-    libfabric_api = FI_VERSION(1, 4);
-    ret = do_fi_getinfo(libfabric_api, &info_list);
-    // Libfabric core will return -FI_ENOSYS if it is too old
-    if (-FI_ENOSYS == ret) {
-        libfabric_api = FI_VERSION(1, 1);
-        ret = do_fi_getinfo(libfabric_api, &info_list);
+    libfabric_api = fi_version();
+    if (libfabric_api < FI_VERSION(1, 3)) {
+        opal_output_verbose(5, USNIC_OUT,
+                            "btl:usnic: disqualifiying myself because Libfabric does not support v1.3 of the API (v1.3 is *required* for correct usNIC functionality).");
+        return NULL;
+    }
+
+    /* Libfabric API 1.3 is fine. Above that, we know that Open MPI
+       works with libfabric API v1.4, so just use that. */
+    if (libfabric_api > FI_VERSION(1, 3)) {
+        libfabric_api = FI_VERSION(1, 4);
     }
+
+    struct fi_info hints = {0};
+    struct fi_ep_attr ep_attr = {0};
+    struct fi_fabric_attr fabric_attr = {0};
+
+    /* We only want providers named "usnic" that are of type EP_DGRAM */
+    fabric_attr.prov_name = "usnic";
+    ep_attr.type = FI_EP_DGRAM;
+
+    hints.caps = FI_MSG;
+    hints.mode = FI_LOCAL_MR | FI_MSG_PREFIX;
+    hints.addr_format = FI_SOCKADDR;
+    hints.ep_attr = &ep_attr;
+    hints.fabric_attr = &fabric_attr;
+
+    ret = fi_getinfo(libfabric_api, NULL, 0, 0, &hints, &info_list);
     if (0 != ret) {
         opal_output_verbose(5, USNIC_OUT,
-                            "btl:usnic: disqualifiying myself due to fi_getinfo failure: %s (%d)", strerror(-ret), ret);
+                            "btl:usnic: disqualifiying myself due to fi_getinfo(3) failure: %s (%d)", strerror(-ret), ret);
         return NULL;
     }
 
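
The comment in this diff notes that the usnic device name moves from the fabric name (API <= v1.3) to the domain name (API >= v1.4). As a rough sketch of how a caller might cope with both conventions, the helper below picks the right string based on the API version that was requested. It is not taken from the commit: `sketch_device_name()` and `SKETCH_VERSION()` are hypothetical names, and the macro simply mirrors the `FI_VERSION()` encoding so the example builds without libfabric.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Local stand-in mirroring libfabric's FI_VERSION() encoding. */
#define SKETCH_VERSION(major, minor) \
    (((uint32_t)(major) << 16) | (uint32_t)(minor))

/* Hypothetical helper: return the string that holds the usnic device
   name ("usnic_X") for the API version that was requested.
   - API <= v1.3: the fabric name is the device name; domain is NULL.
   - API >= v1.4: the fabric name is the network in CIDR notation and
     the domain name is the device name. */
static const char *sketch_device_name(uint32_t requested_api,
                                      const char *fabric_name,
                                      const char *domain_name)
{
    /* The fabric name is expected to be set under both conventions. */
    assert(fabric_name != NULL);

    if (requested_api >= SKETCH_VERSION(1, 4)) {
        return domain_name;
    }
    return fabric_name;
}
```

For example, with a made-up device `usnic_0` on a made-up network `10.0.0.0/24`, a v1.3 request would yield the fabric name `usnic_0`, while a v1.4 request would yield the domain name `usnic_0` and leave `10.0.0.0/24` as the fabric name.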