Update hwloc for MSVC.

XMRig 2024-10-21 08:31:52 +07:00
parent e32731b60b
commit 8a4792f638
GPG key ID: 446A53638BE94409
25 changed files with 875 additions and 329 deletions


@ -1,5 +1,5 @@
Copyright © 2009 CNRS Copyright © 2009 CNRS
Copyright © 2009-2023 Inria. All rights reserved. Copyright © 2009-2024 Inria. All rights reserved.
Copyright © 2009-2013 Université Bordeaux Copyright © 2009-2013 Université Bordeaux
Copyright © 2009-2011 Cisco Systems, Inc. All rights reserved. Copyright © 2009-2011 Cisco Systems, Inc. All rights reserved.
Copyright © 2020 Hewlett Packard Enterprise. All rights reserved. Copyright © 2020 Hewlett Packard Enterprise. All rights reserved.
@ -17,6 +17,71 @@ bug fixes (and other actions) for each version of hwloc since version
0.9. 0.9.
Version 2.11.2
--------------
* Add missing CPU info attrs on aarch64 on Linux.
* Use ACPI CPPC on Linux to get better information about cpukinds,
at least on AMD CPUs.
* Fix crash when manipulating cpukinds after topology
duplication, thanks to Hadrien Grasland for the report.
* Fix missing input target checks in memattr functions,
thanks to Hadrien Grasland for the report.
* Fix a memory leak when ignoring NUMA distances on FreeBSD.
* Fix build failure on old Linux distributions without accessat().
* Fix non-Windows importing of XML topologies and CPUID dumps exported
on Windows.
* hwloc-calc --cpuset-output-format systemd-dbus-api now allows
to generate AllowedCPUs information for systemd slices.
See the hwloc-calc manpage for examples. Thanks to Pierre Neyron.
* Some fixes in manpage EXAMPLES and split them into subsections.
Version 2.11.1
--------------
* Fix bash completions, thanks Tavis Rudd.
Version 2.11.0
--------------
* API
+ Add HWLOC_MEMBIND_WEIGHTED_INTERLEAVE memory binding policy on
Linux 6.9+. Thanks to Honggyu Kim for the patch.
- weighted_interleave_membind is added to membind support bits.
- The "weighted" policy is added to the hwloc-bind tool.
+ Add hwloc_obj_set_subtype(). Thanks to Hadrien Grasland for the report.
* GPU support
+ Don't hide the GPU NUMA node on NVIDIA Grace Hopper.
+ Get Intel GPU OpenCL device locality.
+ Add bandwidths between subdevices in the LevelZero XeLinkBandwidth
matrix.
+ Fix PCI Gen4+ link speed of NVIDIA GPU obtained from NVML,
thanks to Akram Sbaih for the report.
* Windows support
+ Fix Windows support when UNICODE is enabled, several hwloc features
were missing, thanks to Martin for the report.
+ Fix the enabling of CUDA in Windows CMake build,
Thanks to Moritz Kreutzer for the patch.
+ Fix CUDA/OpenCL test source path in Windows CMake.
* Tools
+ Option --best-memattr may now return multiple nodes. Additional
configuration flags may be given to tweak its behavior.
+ hwloc-info has a new --get-attr option to get a single attribute.
+ hwloc-info now supports "levels", "support" and "topology"
special keywords for backward compatibility for hwloc 3.0.
+ The --taskset command-line option is superseded by the new
--cpuset-output-format which also allows to export as list.
+ hwloc-calc may now import bitmasks described as a list of bits
with the new "--cpuset-input-format list".
* Misc
+ The MemoryTiersNr info attribute in the root object now says how many
memory tiers were built. Thanks to Antoine Morvan for the report.
+ Fix the management of infinite cpusets in the bitmap printf/sscanf
API as well as in command-line tools.
+ Add section "Compiling software on top of hwloc's C API" in the
documentation with examples for GNU Make and CMake,
thanks to Florent Pruvost for the help.
Version 2.10.0 Version 2.10.0
-------------- --------------
* Heterogeneous Memory core improvements * Heterogeneous Memory core improvements


@ -418,14 +418,8 @@ return 0;
} }
hwloc provides a pkg-config executable to obtain relevant compiler and linker hwloc provides a pkg-config executable to obtain relevant compiler and linker
flags. For example, it can be used thusly to compile applications that utilize flags. See Compiling software on top of hwloc's C API for details on building
the hwloc library (assuming GNU Make): programs on top of hwloc's API using GNU Make or CMake.
CFLAGS += $(shell pkg-config --cflags hwloc)
LDLIBS += $(shell pkg-config --libs hwloc)
hwloc-hello: hwloc-hello.c
$(CC) hwloc-hello.c $(CFLAGS) -o hwloc-hello $(LDLIBS)
On a machine 2 processor packages -- each package of which has two processing On a machine 2 processor packages -- each package of which has two processing
cores -- the output from running hwloc-hello could be something like the cores -- the output from running hwloc-hello could be something like the


@ -8,8 +8,8 @@
# Please update HWLOC_VERSION* in contrib/windows/hwloc_config.h too. # Please update HWLOC_VERSION* in contrib/windows/hwloc_config.h too.
major=2 major=2
minor=10 minor=11
release=0 release=2
# greek is used for alpha or beta release tags. If it is non-empty, # greek is used for alpha or beta release tags. If it is non-empty,
# it will be appended to the version number. It does not have to be # it will be appended to the version number. It does not have to be
@ -22,7 +22,7 @@ greek=
# The date when this release was created # The date when this release was created
date="Dec 04, 2023" date="Sep 26, 2024"
# If snapshot=1, then use the value from snapshot_version as the # If snapshot=1, then use the value from snapshot_version as the
# entire hwloc version (i.e., ignore major, minor, release, and # entire hwloc version (i.e., ignore major, minor, release, and
@ -41,6 +41,6 @@ snapshot_version=${major}.${minor}.${release}${greek}-git
# 2. Version numbers are described in the Libtool current:revision:age # 2. Version numbers are described in the Libtool current:revision:age
# format. # format.
libhwloc_so_version=22:0:7 libhwloc_so_version=23:1:8
# Please also update the <TargetName> lines in contrib/windows/libhwloc.vcxproj # Please also update the <TargetName> lines in contrib/windows/libhwloc.vcxproj

File diff suppressed because it is too large


@ -1,6 +1,6 @@
/* /*
* Copyright © 2009 CNRS * Copyright © 2009 CNRS
* Copyright © 2009-2023 Inria. All rights reserved. * Copyright © 2009-2024 Inria. All rights reserved.
* Copyright © 2009-2012 Université Bordeaux * Copyright © 2009-2012 Université Bordeaux
* Copyright © 2009-2011 Cisco Systems, Inc. All rights reserved. * Copyright © 2009-2011 Cisco Systems, Inc. All rights reserved.
* See COPYING in top-level directory. * See COPYING in top-level directory.
@ -11,10 +11,10 @@
#ifndef HWLOC_CONFIG_H #ifndef HWLOC_CONFIG_H
#define HWLOC_CONFIG_H #define HWLOC_CONFIG_H
#define HWLOC_VERSION "2.10.0" #define HWLOC_VERSION "2.11.2"
#define HWLOC_VERSION_MAJOR 2 #define HWLOC_VERSION_MAJOR 2
#define HWLOC_VERSION_MINOR 10 #define HWLOC_VERSION_MINOR 11
#define HWLOC_VERSION_RELEASE 0 #define HWLOC_VERSION_RELEASE 2
#define HWLOC_VERSION_GREEK "" #define HWLOC_VERSION_GREEK ""
#define __hwloc_restrict #define __hwloc_restrict


@ -1,5 +1,5 @@
/* /*
* Copyright © 2010-2023 Inria. All rights reserved. * Copyright © 2010-2024 Inria. All rights reserved.
* See COPYING in top-level directory. * See COPYING in top-level directory.
*/ */
@ -28,18 +28,18 @@ extern "C" {
/** \brief Matrix of distances between a set of objects. /** \brief Matrix of distances between a set of objects.
* *
* This matrix often contains latencies between NUMA nodes * The most common matrix contains latencies between NUMA nodes
* (as reported in the System Locality Distance Information Table (SLIT) * (as reported in the System Locality Distance Information Table (SLIT)
* in the ACPI specification), which may or may not be physically accurate. * in the ACPI specification), which may or may not be physically accurate.
* It corresponds to the latency for accessing the memory of one node * It corresponds to the latency for accessing the memory of one node
* from a core in another node. * from a core in another node.
* The corresponding kind is ::HWLOC_DISTANCES_KIND_FROM_OS | ::HWLOC_DISTANCES_KIND_FROM_USER. * The corresponding kind is ::HWLOC_DISTANCES_KIND_MEANS_LATENCY | ::HWLOC_DISTANCES_KIND_FROM_USER.
* The name of this distances structure is "NUMALatency". * The name of this distances structure is "NUMALatency".
* Others distance structures include and "XGMIBandwidth", "XGMIHops",
* "XeLinkBandwidth" and "NVLinkBandwidth".
* *
* The matrix may also contain bandwidths between random sets of objects, * The matrix may also contain bandwidths between random sets of objects,
* possibly provided by the user, as specified in the \p kind attribute. * possibly provided by the user, as specified in the \p kind attribute.
* Other common distance structures include "XGMIBandwidth", "XGMIHops",
* "XeLinkBandwidth" and "NVLinkBandwidth".
* *
* Pointers \p objs and \p values should not be replaced, reallocated, freed, etc. * Pointers \p objs and \p values should not be replaced, reallocated, freed, etc.
* However callers are allowed to modify \p kind as well as the contents * However callers are allowed to modify \p kind as well as the contents
@ -70,11 +70,10 @@ struct hwloc_distances_s {
* The \p kind attribute of struct hwloc_distances_s is a OR'ed set * The \p kind attribute of struct hwloc_distances_s is a OR'ed set
* of kinds. * of kinds.
* *
* A kind of format HWLOC_DISTANCES_KIND_FROM_* specifies where the * Each distance matrix may have only one kind among HWLOC_DISTANCES_KIND_FROM_*
* distance information comes from, if known. * specifying where distance information comes from,
* * and one kind among HWLOC_DISTANCES_KIND_MEANS_* specifying
* A kind of format HWLOC_DISTANCES_KIND_MEANS_* specifies whether * whether values are latencies or bandwidths.
* values are latencies or bandwidths, if applicable.
*/ */
enum hwloc_distances_kind_e { enum hwloc_distances_kind_e {
/** \brief These distances were obtained from the operating system or hardware. /** \brief These distances were obtained from the operating system or hardware.
@ -357,6 +356,8 @@ typedef void * hwloc_distances_add_handle_t;
* Otherwise, it will be copied internally and may later be freed by the caller. * Otherwise, it will be copied internally and may later be freed by the caller.
* *
* \p kind specifies the kind of distance as a OR'ed set of ::hwloc_distances_kind_e. * \p kind specifies the kind of distance as a OR'ed set of ::hwloc_distances_kind_e.
* Only one kind of meaning and one kind of provenance may be given if appropriate
* (e.g. ::HWLOC_DISTANCES_KIND_MEANS_BANDWIDTH and ::HWLOC_DISTANCES_KIND_FROM_USER).
* Kind ::HWLOC_DISTANCES_KIND_HETEROGENEOUS_TYPES will be automatically set * Kind ::HWLOC_DISTANCES_KIND_HETEROGENEOUS_TYPES will be automatically set
* according to objects having different types in hwloc_distances_add_values(). * according to objects having different types in hwloc_distances_add_values().
* *
@ -403,7 +404,8 @@ HWLOC_DECLSPEC int hwloc_distances_add_values(hwloc_topology_t topology,
/** \brief Flags for adding a new distances to a topology. */ /** \brief Flags for adding a new distances to a topology. */
enum hwloc_distances_add_flag_e { enum hwloc_distances_add_flag_e {
/** \brief Try to group objects based on the newly provided distance information. /** \brief Try to group objects based on the newly provided distance information.
* This is ignored for distances between objects of different types. * Grouping is only performed when the distances structure contains latencies,
* and when all objects are of the same type.
* \hideinitializer * \hideinitializer
*/ */
HWLOC_DISTANCES_ADD_FLAG_GROUP = (1UL<<0), HWLOC_DISTANCES_ADD_FLAG_GROUP = (1UL<<0),
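A minimal sketch (not part of this diff) of reading the "NUMALatency" matrix described above, assuming a machine with several NUMA nodes; it uses hwloc_distances_get_by_name() and the FROM_*/MEANS_* kind bits discussed in the hunk:

#include <hwloc.h>
#include <hwloc/distances.h>
#include <stdio.h>

int main(void)
{
    hwloc_topology_t topology;
    struct hwloc_distances_s *dist;
    unsigned nr = 1, i, j;

    hwloc_topology_init(&topology);
    hwloc_topology_load(topology);

    /* ask for the matrix by name; nr is in/out and limits how many matrices are returned */
    if (hwloc_distances_get_by_name(topology, "NUMALatency", &nr, &dist, 0) == 0 && nr > 0) {
        /* dist->kind is expected to carry one FROM_* bit and one MEANS_* bit */
        printf("kind=0x%lx nbobjs=%u\n", dist->kind, dist->nbobjs);
        for (i = 0; i < dist->nbobjs; i++)
            for (j = 0; j < dist->nbobjs; j++)
                printf("  latency[%u][%u] = %llu\n", i, j,
                       (unsigned long long) dist->values[i * dist->nbobjs + j]);
        hwloc_distances_release(topology, dist);
    }
    hwloc_topology_destroy(topology);
    return 0;
}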


@ -1,6 +1,6 @@
/* /*
* Copyright © 2009 CNRS * Copyright © 2009 CNRS
* Copyright © 2009-2023 Inria. All rights reserved. * Copyright © 2009-2024 Inria. All rights reserved.
* Copyright © 2009-2012 Université Bordeaux * Copyright © 2009-2012 Université Bordeaux
* Copyright © 2009-2010 Cisco Systems, Inc. All rights reserved. * Copyright © 2009-2010 Cisco Systems, Inc. All rights reserved.
* See COPYING in top-level directory. * See COPYING in top-level directory.
@ -946,6 +946,14 @@ enum hwloc_distrib_flags_e {
* *
* \return 0 on success, -1 on error. * \return 0 on success, -1 on error.
* *
* \note On hybrid CPUs (or asymmetric platforms), distribution may be suboptimal
* since the number of cores or PUs inside packages or below caches may vary
* (the top-down recursive partitioning ignores these numbers until reaching their levels).
* Hence it is recommended to distribute only inside a single homogeneous domain.
* For instance on a CPU with energy-efficient E-cores and high-performance P-cores,
* one should distribute separately N tasks on E-cores and M tasks on P-cores
* instead of trying to distribute directly M+N tasks on the entire CPUs.
*
* \note This function requires the \p roots objects to have a CPU set. * \note This function requires the \p roots objects to have a CPU set.
*/ */
static __hwloc_inline int static __hwloc_inline int
@ -960,7 +968,7 @@ hwloc_distrib(hwloc_topology_t topology,
unsigned given, givenweight; unsigned given, givenweight;
hwloc_cpuset_t *cpusetp = set; hwloc_cpuset_t *cpusetp = set;
if (flags & ~HWLOC_DISTRIB_FLAG_REVERSE) { if (!n || (flags & ~HWLOC_DISTRIB_FLAG_REVERSE)) {
errno = EINVAL; errno = EINVAL;
return -1; return -1;
} }
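A minimal sketch (not part of this diff) of the usage recommended in the note above: distributing a fixed number of tasks below one root with hwloc_distrib(). The count of 8 is arbitrary; on a hybrid CPU this would be called once per homogeneous domain rather than once for the whole machine.

#include <hwloc.h>
#include <limits.h>
#include <stdio.h>

int main(void)
{
    hwloc_topology_t topology;
    hwloc_obj_t root;
    hwloc_cpuset_t sets[8];
    char buf[256];
    unsigned i;

    hwloc_topology_init(&topology);
    hwloc_topology_load(topology);

    root = hwloc_get_root_obj(topology);
    /* spread 8 cpusets under this root, recursing down to the deepest level */
    if (hwloc_distrib(topology, &root, 1, sets, 8, INT_MAX, 0) == 0) {
        for (i = 0; i < 8; i++) {
            hwloc_bitmap_snprintf(buf, sizeof(buf), sets[i]);
            printf("task %u -> %s\n", i, buf);
            hwloc_bitmap_free(sets[i]);
        }
    }
    hwloc_topology_destroy(topology);
    return 0;
}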


@ -1,5 +1,5 @@
/* /*
* Copyright © 2019-2023 Inria. All rights reserved. * Copyright © 2019-2024 Inria. All rights reserved.
* See COPYING in top-level directory. * See COPYING in top-level directory.
*/ */
@ -69,7 +69,10 @@ extern "C" {
* @{ * @{
*/ */
/** \brief Memory node attributes. */ /** \brief Predefined memory attribute IDs.
* See ::hwloc_memattr_id_t for the generic definition of IDs
* for predefined or custom attributes.
*/
enum hwloc_memattr_id_e { enum hwloc_memattr_id_e {
/** \brief /** \brief
* The \"Capacity\" is returned in bytes (local_memory attribute in objects). * The \"Capacity\" is returned in bytes (local_memory attribute in objects).
@ -78,6 +81,8 @@ enum hwloc_memattr_id_e {
* *
* No initiator is involved when looking at this attribute. * No initiator is involved when looking at this attribute.
* The corresponding attribute flags are ::HWLOC_MEMATTR_FLAG_HIGHER_FIRST. * The corresponding attribute flags are ::HWLOC_MEMATTR_FLAG_HIGHER_FIRST.
*
* Capacity values may not be modified using hwloc_memattr_set_value().
* \hideinitializer * \hideinitializer
*/ */
HWLOC_MEMATTR_ID_CAPACITY = 0, HWLOC_MEMATTR_ID_CAPACITY = 0,
@ -93,6 +98,8 @@ enum hwloc_memattr_id_e {
* *
* No initiator is involved when looking at this attribute. * No initiator is involved when looking at this attribute.
* The corresponding attribute flags are ::HWLOC_MEMATTR_FLAG_HIGHER_FIRST. * The corresponding attribute flags are ::HWLOC_MEMATTR_FLAG_HIGHER_FIRST.
* Locality values may not be modified using hwloc_memattr_set_value().
* \hideinitializer * \hideinitializer
*/ */
HWLOC_MEMATTR_ID_LOCALITY = 1, HWLOC_MEMATTR_ID_LOCALITY = 1,
@ -173,11 +180,19 @@ enum hwloc_memattr_id_e {
/* TODO persistence? */ /* TODO persistence? */
HWLOC_MEMATTR_ID_MAX /**< \private Sentinel value */ HWLOC_MEMATTR_ID_MAX /**< \private
* Sentinel value for predefined attributes.
* Dynamically registered custom attributes start here.
*/
}; };
/** \brief A memory attribute identifier. /** \brief A memory attribute identifier.
* May be either one of ::hwloc_memattr_id_e or a new id returned by hwloc_memattr_register(). *
* hwloc predefines some commonly-used attributes in ::hwloc_memattr_id_e.
* One may then dynamically register custom ones with hwloc_memattr_register(),
* they will be assigned IDs immediately after the predefined ones.
* See \ref hwlocality_memattrs_manage for more information about
* existing attribute IDs.
*/ */
typedef unsigned hwloc_memattr_id_t; typedef unsigned hwloc_memattr_id_t;
@ -283,6 +298,10 @@ hwloc_get_local_numanode_objs(hwloc_topology_t topology,
* (it does not have the flag ::HWLOC_MEMATTR_FLAG_NEED_INITIATOR), * (it does not have the flag ::HWLOC_MEMATTR_FLAG_NEED_INITIATOR),
* location \p initiator is ignored and may be \c NULL. * location \p initiator is ignored and may be \c NULL.
* *
* \p target_node cannot be \c NULL. If \p attribute is ::HWLOC_MEMATTR_ID_CAPACITY,
* \p target_node must be a NUMA node. If it is ::HWLOC_MEMATTR_ID_LOCALITY,
* \p target_node must have a CPU set.
*
* \p flags must be \c 0 for now. * \p flags must be \c 0 for now.
* *
* \return 0 on success. * \return 0 on success.
@ -352,6 +371,8 @@ hwloc_memattr_get_best_target(hwloc_topology_t topology,
* The returned initiator should not be modified or freed, * The returned initiator should not be modified or freed,
* it belongs to the topology. * it belongs to the topology.
* *
* \p target_node cannot be \c NULL.
*
* \p flags must be \c 0 for now. * \p flags must be \c 0 for now.
* *
* \return 0 on success. * \return 0 on success.
@ -362,100 +383,10 @@ hwloc_memattr_get_best_target(hwloc_topology_t topology,
HWLOC_DECLSPEC int HWLOC_DECLSPEC int
hwloc_memattr_get_best_initiator(hwloc_topology_t topology, hwloc_memattr_get_best_initiator(hwloc_topology_t topology,
hwloc_memattr_id_t attribute, hwloc_memattr_id_t attribute,
hwloc_obj_t target, hwloc_obj_t target_node,
unsigned long flags, unsigned long flags,
struct hwloc_location *best_initiator, hwloc_uint64_t *value); struct hwloc_location *best_initiator, hwloc_uint64_t *value);
/** @} */
/** \defgroup hwlocality_memattrs_manage Managing memory attributes
* @{
*/
/** \brief Return the name of a memory attribute.
*
* \return 0 on success.
* \return -1 with errno set to \c EINVAL if the attribute does not exist.
*/
HWLOC_DECLSPEC int
hwloc_memattr_get_name(hwloc_topology_t topology,
hwloc_memattr_id_t attribute,
const char **name);
/** \brief Return the flags of the given attribute.
*
* Flags are a OR'ed set of ::hwloc_memattr_flag_e.
*
* \return 0 on success.
* \return -1 with errno set to \c EINVAL if the attribute does not exist.
*/
HWLOC_DECLSPEC int
hwloc_memattr_get_flags(hwloc_topology_t topology,
hwloc_memattr_id_t attribute,
unsigned long *flags);
/** \brief Memory attribute flags.
* Given to hwloc_memattr_register() and returned by hwloc_memattr_get_flags().
*/
enum hwloc_memattr_flag_e {
/** \brief The best nodes for this memory attribute are those with the higher values.
* For instance Bandwidth.
*/
HWLOC_MEMATTR_FLAG_HIGHER_FIRST = (1UL<<0),
/** \brief The best nodes for this memory attribute are those with the lower values.
* For instance Latency.
*/
HWLOC_MEMATTR_FLAG_LOWER_FIRST = (1UL<<1),
/** \brief The value returned for this memory attribute depends on the given initiator.
* For instance Bandwidth and Latency, but not Capacity.
*/
HWLOC_MEMATTR_FLAG_NEED_INITIATOR = (1UL<<2)
};
/** \brief Register a new memory attribute.
*
* Add a specific memory attribute that is not defined in ::hwloc_memattr_id_e.
* Flags are a OR'ed set of ::hwloc_memattr_flag_e. It must contain at least
* one of ::HWLOC_MEMATTR_FLAG_HIGHER_FIRST or ::HWLOC_MEMATTR_FLAG_LOWER_FIRST.
*
* \return 0 on success.
* \return -1 with errno set to \c EBUSY if another attribute already uses this name.
*/
HWLOC_DECLSPEC int
hwloc_memattr_register(hwloc_topology_t topology,
const char *name,
unsigned long flags,
hwloc_memattr_id_t *id);
/** \brief Set an attribute value for a specific target NUMA node.
*
* If the attribute does not relate to a specific initiator
* (it does not have the flag ::HWLOC_MEMATTR_FLAG_NEED_INITIATOR),
* location \p initiator is ignored and may be \c NULL.
*
* The initiator will be copied into the topology,
* the caller should free anything allocated to store the initiator,
* for instance the cpuset.
*
* \p flags must be \c 0 for now.
*
* \note The initiator \p initiator should be of type ::HWLOC_LOCATION_TYPE_CPUSET
* when referring to accesses performed by CPU cores.
* ::HWLOC_LOCATION_TYPE_OBJECT is currently unused internally by hwloc,
* but users may for instance use it to provide custom information about
* host memory accesses performed by GPUs.
*
* \return 0 on success or -1 on error.
*/
HWLOC_DECLSPEC int
hwloc_memattr_set_value(hwloc_topology_t topology,
hwloc_memattr_id_t attribute,
hwloc_obj_t target_node,
struct hwloc_location *initiator,
unsigned long flags,
hwloc_uint64_t value);
/** \brief Return the target NUMA nodes that have some values for a given attribute. /** \brief Return the target NUMA nodes that have some values for a given attribute.
* *
* Return targets for the given attribute in the \p targets array * Return targets for the given attribute in the \p targets array
@ -519,6 +450,8 @@ hwloc_memattr_get_targets(hwloc_topology_t topology,
* The returned initiators should not be modified or freed, * The returned initiators should not be modified or freed,
* they belong to the topology. * they belong to the topology.
* *
* \p target_node cannot be \c NULL.
*
* \p flags must be \c 0 for now. * \p flags must be \c 0 for now.
* *
* If the attribute does not relate to a specific initiator * If the attribute does not relate to a specific initiator
@ -538,6 +471,131 @@ hwloc_memattr_get_initiators(hwloc_topology_t topology,
hwloc_obj_t target_node, hwloc_obj_t target_node,
unsigned long flags, unsigned long flags,
unsigned *nr, struct hwloc_location *initiators, hwloc_uint64_t *values); unsigned *nr, struct hwloc_location *initiators, hwloc_uint64_t *values);
/** @} */
/** \defgroup hwlocality_memattrs_manage Managing memory attributes
*
* Memory attributes are identified by an ID (::hwloc_memattr_id_t)
* and a name. hwloc_memattr_get_name() and hwloc_memattr_get_by_name()
* convert between them (or return error if the attribute does not exist).
*
* The set of valid ::hwloc_memattr_id_t is a contiguous set starting at \c 0.
* It first contains predefined attributes, as listed
* in ::hwloc_memattr_id_e (from \c 0 to \c HWLOC_MEMATTR_ID_MAX-1).
* Then custom attributes may be dynamically registered with
* hwloc_memattr_register(). They will get the following IDs
* (\c HWLOC_MEMATTR_ID_MAX for the first one, etc.).
*
* To iterate over all valid attributes
* (either predefined or dynamically registered custom ones),
* one may iterate over IDs starting from \c 0 until hwloc_memattr_get_name()
* or hwloc_memattr_get_flags() returns an error.
*
* The values for an existing attribute or for custom dynamically registered ones
* may be set or modified with hwloc_memattr_set_value().
*
* @{
*/
/** \brief Return the name of a memory attribute.
*
* The output pointer \p name cannot be \c NULL.
*
* \return 0 on success.
* \return -1 with errno set to \c EINVAL if the attribute does not exist.
*/
HWLOC_DECLSPEC int
hwloc_memattr_get_name(hwloc_topology_t topology,
hwloc_memattr_id_t attribute,
const char **name);
/** \brief Return the flags of the given attribute.
*
* Flags are a OR'ed set of ::hwloc_memattr_flag_e.
*
* The output pointer \p flags cannot be \c NULL.
*
* \return 0 on success.
* \return -1 with errno set to \c EINVAL if the attribute does not exist.
*/
HWLOC_DECLSPEC int
hwloc_memattr_get_flags(hwloc_topology_t topology,
hwloc_memattr_id_t attribute,
unsigned long *flags);
/** \brief Memory attribute flags.
* Given to hwloc_memattr_register() and returned by hwloc_memattr_get_flags().
*/
enum hwloc_memattr_flag_e {
/** \brief The best nodes for this memory attribute are those with the higher values.
* For instance Bandwidth.
*/
HWLOC_MEMATTR_FLAG_HIGHER_FIRST = (1UL<<0),
/** \brief The best nodes for this memory attribute are those with the lower values.
* For instance Latency.
*/
HWLOC_MEMATTR_FLAG_LOWER_FIRST = (1UL<<1),
/** \brief The value returned for this memory attribute depends on the given initiator.
* For instance Bandwidth and Latency, but not Capacity.
*/
HWLOC_MEMATTR_FLAG_NEED_INITIATOR = (1UL<<2)
};
/** \brief Register a new memory attribute.
*
* Add a new custom memory attribute.
* Flags are a OR'ed set of ::hwloc_memattr_flag_e. It must contain one of
* ::HWLOC_MEMATTR_FLAG_HIGHER_FIRST or ::HWLOC_MEMATTR_FLAG_LOWER_FIRST but not both.
*
* The new attribute \p id is immediately after the last existing attribute ID
* (which is either the ID of the last registered attribute if any,
* or the ID of the last predefined attribute in ::hwloc_memattr_id_e).
*
* \return 0 on success.
* \return -1 with errno set to \c EINVAL if an invalid set of flags is given.
* \return -1 with errno set to \c EBUSY if another attribute already uses this name.
*/
HWLOC_DECLSPEC int
hwloc_memattr_register(hwloc_topology_t topology,
const char *name,
unsigned long flags,
hwloc_memattr_id_t *id);
/** \brief Set an attribute value for a specific target NUMA node.
*
* If the attribute does not relate to a specific initiator
* (it does not have the flag ::HWLOC_MEMATTR_FLAG_NEED_INITIATOR),
* location \p initiator is ignored and may be \c NULL.
*
* The initiator will be copied into the topology,
* the caller should free anything allocated to store the initiator,
* for instance the cpuset.
*
* \p target_node cannot be \c NULL.
*
* \p attribute cannot be ::HWLOC_MEMATTR_ID_CAPACITY or
* ::HWLOC_MEMATTR_ID_LOCALITY.
*
* \p flags must be \c 0 for now.
*
* \note The initiator \p initiator should be of type ::HWLOC_LOCATION_TYPE_CPUSET
* when referring to accesses performed by CPU cores.
* ::HWLOC_LOCATION_TYPE_OBJECT is currently unused internally by hwloc,
* but users may for instance use it to provide custom information about
* host memory accesses performed by GPUs.
*
* \return 0 on success or -1 on error.
*/
HWLOC_DECLSPEC int
hwloc_memattr_set_value(hwloc_topology_t topology,
hwloc_memattr_id_t attribute,
hwloc_obj_t target_node,
struct hwloc_location *initiator,
unsigned long flags,
hwloc_uint64_t value);
/** @} */ /** @} */
#ifdef __cplusplus #ifdef __cplusplus
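A minimal sketch (not part of this diff) of the register/set/get flow described in the management section above; the attribute name "MyWearLevel" and the value 42 are invented for illustration:

#include <hwloc.h>
#include <hwloc/memattrs.h>
#include <stdio.h>

int main(void)
{
    hwloc_topology_t topology;
    hwloc_memattr_id_t id;
    hwloc_obj_t node;
    hwloc_uint64_t value;

    hwloc_topology_init(&topology);
    hwloc_topology_load(topology);

    node = hwloc_get_obj_by_type(topology, HWLOC_OBJ_NUMANODE, 0);

    /* custom attribute, no initiator needed, higher values considered better */
    if (node
        && hwloc_memattr_register(topology, "MyWearLevel",
                                  HWLOC_MEMATTR_FLAG_HIGHER_FIRST, &id) == 0) {
        hwloc_memattr_set_value(topology, id, node, NULL, 0, 42);
        if (hwloc_memattr_get_value(topology, id, node, NULL, 0, &value) == 0)
            printf("node#0 MyWearLevel = %llu\n", (unsigned long long) value);
    }

    hwloc_topology_destroy(topology);
    return 0;
}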


@ -41,6 +41,15 @@ extern "C" {
*/ */
/* Copyright (c) 2008-2018 The Khronos Group Inc. */ /* Copyright (c) 2008-2018 The Khronos Group Inc. */
/* needs "cl_khr_pci_bus_info" device extension, but not strictly required for clGetDeviceInfo() */
typedef struct {
cl_uint pci_domain;
cl_uint pci_bus;
cl_uint pci_device;
cl_uint pci_function;
} hwloc_cl_device_pci_bus_info_khr;
#define HWLOC_CL_DEVICE_PCI_BUS_INFO_KHR 0x410F
/* needs "cl_amd_device_attribute_query" device extension, but not strictly required for clGetDeviceInfo() */ /* needs "cl_amd_device_attribute_query" device extension, but not strictly required for clGetDeviceInfo() */
#define HWLOC_CL_DEVICE_TOPOLOGY_AMD 0x4037 #define HWLOC_CL_DEVICE_TOPOLOGY_AMD 0x4037
typedef union { typedef union {
@ -78,9 +87,19 @@ hwloc_opencl_get_device_pci_busid(cl_device_id device,
unsigned *domain, unsigned *bus, unsigned *dev, unsigned *func) unsigned *domain, unsigned *bus, unsigned *dev, unsigned *func)
{ {
hwloc_cl_device_topology_amd amdtopo; hwloc_cl_device_topology_amd amdtopo;
hwloc_cl_device_pci_bus_info_khr khrbusinfo;
cl_uint nvbus, nvslot, nvdomain; cl_uint nvbus, nvslot, nvdomain;
cl_int clret; cl_int clret;
clret = clGetDeviceInfo(device, HWLOC_CL_DEVICE_PCI_BUS_INFO_KHR, sizeof(khrbusinfo), &khrbusinfo, NULL);
if (CL_SUCCESS == clret) {
*domain = (unsigned) khrbusinfo.pci_domain;
*bus = (unsigned) khrbusinfo.pci_bus;
*dev = (unsigned) khrbusinfo.pci_device;
*func = (unsigned) khrbusinfo.pci_function;
return 0;
}
clret = clGetDeviceInfo(device, HWLOC_CL_DEVICE_TOPOLOGY_AMD, sizeof(amdtopo), &amdtopo, NULL); clret = clGetDeviceInfo(device, HWLOC_CL_DEVICE_TOPOLOGY_AMD, sizeof(amdtopo), &amdtopo, NULL);
if (CL_SUCCESS == clret if (CL_SUCCESS == clret
&& HWLOC_CL_DEVICE_TOPOLOGY_TYPE_PCIE_AMD == amdtopo.raw.type) { && HWLOC_CL_DEVICE_TOPOLOGY_TYPE_PCIE_AMD == amdtopo.raw.type) {
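A hedged sketch (not part of this diff) of using the helper once the KHR query above succeeds; it assumes a cl_device_id obtained elsewhere and a topology loaded with PCI discovery enabled:

#include <hwloc.h>
#include <hwloc/opencl.h>
#include <stdio.h>

static void print_device_location(hwloc_topology_t topology, cl_device_id device)
{
    unsigned domain, bus, dev, func;

    if (hwloc_opencl_get_device_pci_busid(device, &domain, &bus, &dev, &func) == 0) {
        /* look up the matching PCI object in the topology, if I/O objects are kept */
        hwloc_obj_t pcidev = hwloc_get_pcidev_by_busid(topology, domain, bus, dev, func);
        printf("OpenCL device is at %04x:%02x:%02x.%01x%s\n",
               domain, bus, dev, func, pcidev ? " (found in topology)" : "");
    }
}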


@ -1,5 +1,5 @@
/* /*
* Copyright © 2013-2022 Inria. All rights reserved. * Copyright © 2013-2024 Inria. All rights reserved.
* Copyright © 2016 Cisco Systems, Inc. All rights reserved. * Copyright © 2016 Cisco Systems, Inc. All rights reserved.
* See COPYING in top-level directory. * See COPYING in top-level directory.
*/ */
@ -645,6 +645,19 @@ HWLOC_DECLSPEC struct hwloc_obj * hwloc_pci_find_parent_by_busid(struct hwloc_to
*/ */
HWLOC_DECLSPEC struct hwloc_obj * hwloc_pci_find_by_busid(struct hwloc_topology *topology, unsigned domain, unsigned bus, unsigned dev, unsigned func); HWLOC_DECLSPEC struct hwloc_obj * hwloc_pci_find_by_busid(struct hwloc_topology *topology, unsigned domain, unsigned bus, unsigned dev, unsigned func);
/** @} */
/** \defgroup hwlocality_components_distances Components and Plugins: distances
*
* \note These structures and functions may change when ::HWLOC_COMPONENT_ABI is modified.
*
* @{
*/
/** \brief Handle to a new distances structure during its addition to the topology. */ /** \brief Handle to a new distances structure during its addition to the topology. */
typedef void * hwloc_backend_distances_add_handle_t; typedef void * hwloc_backend_distances_add_handle_t;


@ -1,6 +1,6 @@
/* /*
* Copyright © 2009-2011 Cisco Systems, Inc. All rights reserved. * Copyright © 2009-2011 Cisco Systems, Inc. All rights reserved.
* Copyright © 2010-2022 Inria. All rights reserved. * Copyright © 2010-2024 Inria. All rights reserved.
* See COPYING in top-level directory. * See COPYING in top-level directory.
*/ */
@ -210,6 +210,7 @@ extern "C" {
#define hwloc_obj_get_info_by_name HWLOC_NAME(obj_get_info_by_name) #define hwloc_obj_get_info_by_name HWLOC_NAME(obj_get_info_by_name)
#define hwloc_obj_add_info HWLOC_NAME(obj_add_info) #define hwloc_obj_add_info HWLOC_NAME(obj_add_info)
#define hwloc_obj_set_subtype HWLOC_NAME(obj_set_subtype)
#define HWLOC_CPUBIND_PROCESS HWLOC_NAME_CAPS(CPUBIND_PROCESS) #define HWLOC_CPUBIND_PROCESS HWLOC_NAME_CAPS(CPUBIND_PROCESS)
#define HWLOC_CPUBIND_THREAD HWLOC_NAME_CAPS(CPUBIND_THREAD) #define HWLOC_CPUBIND_THREAD HWLOC_NAME_CAPS(CPUBIND_THREAD)
@ -232,6 +233,7 @@ extern "C" {
#define HWLOC_MEMBIND_FIRSTTOUCH HWLOC_NAME_CAPS(MEMBIND_FIRSTTOUCH) #define HWLOC_MEMBIND_FIRSTTOUCH HWLOC_NAME_CAPS(MEMBIND_FIRSTTOUCH)
#define HWLOC_MEMBIND_BIND HWLOC_NAME_CAPS(MEMBIND_BIND) #define HWLOC_MEMBIND_BIND HWLOC_NAME_CAPS(MEMBIND_BIND)
#define HWLOC_MEMBIND_INTERLEAVE HWLOC_NAME_CAPS(MEMBIND_INTERLEAVE) #define HWLOC_MEMBIND_INTERLEAVE HWLOC_NAME_CAPS(MEMBIND_INTERLEAVE)
#define HWLOC_MEMBIND_WEIGHTED_INTERLEAVE HWLOC_NAME_CAPS(MEMBIND_WEIGHTED_INTERLEAVE)
#define HWLOC_MEMBIND_NEXTTOUCH HWLOC_NAME_CAPS(MEMBIND_NEXTTOUCH) #define HWLOC_MEMBIND_NEXTTOUCH HWLOC_NAME_CAPS(MEMBIND_NEXTTOUCH)
#define HWLOC_MEMBIND_MIXED HWLOC_NAME_CAPS(MEMBIND_MIXED) #define HWLOC_MEMBIND_MIXED HWLOC_NAME_CAPS(MEMBIND_MIXED)
@ -560,6 +562,7 @@ extern "C" {
/* opencl.h */ /* opencl.h */
#define hwloc_cl_device_pci_bus_info_khr HWLOC_NAME(cl_device_pci_bus_info_khr)
#define hwloc_cl_device_topology_amd HWLOC_NAME(cl_device_topology_amd) #define hwloc_cl_device_topology_amd HWLOC_NAME(cl_device_topology_amd)
#define hwloc_opencl_get_device_pci_busid HWLOC_NAME(opencl_get_device_pci_ids) #define hwloc_opencl_get_device_pci_busid HWLOC_NAME(opencl_get_device_pci_ids)
#define hwloc_opencl_get_device_cpuset HWLOC_NAME(opencl_get_device_cpuset) #define hwloc_opencl_get_device_cpuset HWLOC_NAME(opencl_get_device_cpuset)
@ -715,6 +718,8 @@ extern "C" {
#define hwloc__obj_type_is_dcache HWLOC_NAME(_obj_type_is_dcache) #define hwloc__obj_type_is_dcache HWLOC_NAME(_obj_type_is_dcache)
#define hwloc__obj_type_is_icache HWLOC_NAME(_obj_type_is_icache) #define hwloc__obj_type_is_icache HWLOC_NAME(_obj_type_is_icache)
#define hwloc__pci_link_speed HWLOC_NAME(_pci_link_speed)
/* private/cpuid-x86.h */ /* private/cpuid-x86.h */
#define hwloc_have_x86_cpuid HWLOC_NAME(have_x86_cpuid) #define hwloc_have_x86_cpuid HWLOC_NAME(have_x86_cpuid)


@ -1,6 +1,6 @@
/* /*
* Copyright © 2009, 2011, 2012 CNRS. All rights reserved. * Copyright © 2009, 2011, 2012 CNRS. All rights reserved.
* Copyright © 2009-2021 Inria. All rights reserved. * Copyright © 2009-2020 Inria. All rights reserved.
* Copyright © 2009, 2011, 2012, 2015 Université Bordeaux. All rights reserved. * Copyright © 2009, 2011, 2012, 2015 Université Bordeaux. All rights reserved.
* Copyright © 2009-2020 Cisco Systems, Inc. All rights reserved. * Copyright © 2009-2020 Cisco Systems, Inc. All rights reserved.
* $COPYRIGHT$ * $COPYRIGHT$
@ -17,6 +17,10 @@
#define HWLOC_HAVE_MSVC_CPUIDEX 1 #define HWLOC_HAVE_MSVC_CPUIDEX 1
/* #undef HAVE_MKSTEMP */
#define HWLOC_HAVE_X86_CPUID 1
/* Define to 1 if the system has the type `CACHE_DESCRIPTOR'. */ /* Define to 1 if the system has the type `CACHE_DESCRIPTOR'. */
#define HAVE_CACHE_DESCRIPTOR 0 #define HAVE_CACHE_DESCRIPTOR 0
@ -128,8 +132,7 @@
#define HAVE_DECL__SC_PAGE_SIZE 0 #define HAVE_DECL__SC_PAGE_SIZE 0
/* Define to 1 if you have the <dirent.h> header file. */ /* Define to 1 if you have the <dirent.h> header file. */
/* #define HAVE_DIRENT_H 1 */ /* #undef HAVE_DIRENT_H */
#undef HAVE_DIRENT_H
/* Define to 1 if you have the <dlfcn.h> header file. */ /* Define to 1 if you have the <dlfcn.h> header file. */
/* #undef HAVE_DLFCN_H */ /* #undef HAVE_DLFCN_H */
@ -282,7 +285,7 @@
#define HAVE_STRING_H 1 #define HAVE_STRING_H 1
/* Define to 1 if you have the `strncasecmp' function. */ /* Define to 1 if you have the `strncasecmp' function. */
#define HAVE_STRNCASECMP 1 /* #undef HAVE_STRNCASECMP */
/* Define to '1' if sysctl is present and usable */ /* Define to '1' if sysctl is present and usable */
/* #undef HAVE_SYSCTL */ /* #undef HAVE_SYSCTL */
@ -323,8 +326,7 @@
/* #undef HAVE_UNAME */ /* #undef HAVE_UNAME */
/* Define to 1 if you have the <unistd.h> header file. */ /* Define to 1 if you have the <unistd.h> header file. */
/* #define HAVE_UNISTD_H 1 */ /* #undef HAVE_UNISTD_H */
#undef HAVE_UNISTD_H
/* Define to 1 if you have the `uselocale' function. */ /* Define to 1 if you have the `uselocale' function. */
/* #undef HAVE_USELOCALE */ /* #undef HAVE_USELOCALE */
@ -659,7 +661,7 @@
#define hwloc_pid_t HANDLE #define hwloc_pid_t HANDLE
/* Define this to either strncasecmp or strncmp */ /* Define this to either strncasecmp or strncmp */
#define hwloc_strncasecmp strncasecmp /* #undef hwloc_strncasecmp */
/* Define this to the thread ID type */ /* Define this to the thread ID type */
#define hwloc_thread_t HANDLE #define hwloc_thread_t HANDLE


@ -11,6 +11,22 @@
#ifndef HWLOC_PRIVATE_CPUID_X86_H #ifndef HWLOC_PRIVATE_CPUID_X86_H
#define HWLOC_PRIVATE_CPUID_X86_H #define HWLOC_PRIVATE_CPUID_X86_H
/* A macro for annotating memory as initialized when building with MSAN
* (and otherwise having no effect). See below for why this is used with
* our custom assembly.
*/
#ifdef __has_feature
#define HWLOC_HAS_FEATURE(name) __has_feature(name)
#else
#define HWLOC_HAS_FEATURE(name) 0
#endif
#if HWLOC_HAS_FEATURE(memory_sanitizer) || defined(MEMORY_SANITIZER)
#include <sanitizer/msan_interface.h>
#define HWLOC_ANNOTATE_MEMORY_IS_INITIALIZED(ptr, len) __msan_unpoison(ptr, len)
#else
#define HWLOC_ANNOTATE_MEMORY_IS_INITIALIZED(ptr, len)
#endif
#if (defined HWLOC_X86_32_ARCH) && (!defined HWLOC_HAVE_MSVC_CPUIDEX) #if (defined HWLOC_X86_32_ARCH) && (!defined HWLOC_HAVE_MSVC_CPUIDEX)
static __hwloc_inline int hwloc_have_x86_cpuid(void) static __hwloc_inline int hwloc_have_x86_cpuid(void)
{ {
@ -71,12 +87,18 @@ static __hwloc_inline void hwloc_x86_cpuid(unsigned *eax, unsigned *ebx, unsigne
"movl %k2,%1\n\t" "movl %k2,%1\n\t"
: "+a" (*eax), "=m" (*ebx), "=&r"(sav_rbx), : "+a" (*eax), "=m" (*ebx), "=&r"(sav_rbx),
"+c" (*ecx), "=&d" (*edx)); "+c" (*ecx), "=&d" (*edx));
/* MSAN does not recognize the effect of the above assembly on the memory operand
* (`"=m"(*ebx)`). This may get improved in MSAN at some point in the future, e.g.
* see https://github.com/llvm/llvm-project/pull/77393. */
HWLOC_ANNOTATE_MEMORY_IS_INITIALIZED(ebx, sizeof *ebx);
#elif defined(HWLOC_X86_32_ARCH) #elif defined(HWLOC_X86_32_ARCH)
__asm__( __asm__(
"mov %%ebx,%1\n\t" "mov %%ebx,%1\n\t"
"cpuid\n\t" "cpuid\n\t"
"xchg %%ebx,%1\n\t" "xchg %%ebx,%1\n\t"
: "+a" (*eax), "=&SD" (*ebx), "+c" (*ecx), "=&d" (*edx)); : "+a" (*eax), "=&SD" (*ebx), "+c" (*ecx), "=&d" (*edx));
/* See above. */
HWLOC_ANNOTATE_MEMORY_IS_INITIALIZED(ebx, sizeof *ebx);
#else #else
#error unknown architecture #error unknown architecture
#endif #endif


@ -1,6 +1,6 @@
/* /*
* Copyright © 2009 CNRS * Copyright © 2009 CNRS
* Copyright © 2009-2019 Inria. All rights reserved. * Copyright © 2009-2024 Inria. All rights reserved.
* Copyright © 2009-2012 Université Bordeaux * Copyright © 2009-2012 Université Bordeaux
* Copyright © 2011 Cisco Systems, Inc. All rights reserved. * Copyright © 2011 Cisco Systems, Inc. All rights reserved.
* See COPYING in top-level directory. * See COPYING in top-level directory.
@ -573,4 +573,35 @@ typedef SSIZE_T ssize_t;
# endif # endif
#endif #endif
static __inline float
hwloc__pci_link_speed(unsigned generation, unsigned lanes)
{
float lanespeed;
/*
* These are single-direction bandwidths only.
*
* Gen1 used NRZ with 8/10 encoding.
* PCIe Gen1 = 2.5GT/s signal-rate per lane x 8/10 = 0.25GB/s data-rate per lane
* PCIe Gen2 = 5 GT/s signal-rate per lane x 8/10 = 0.5 GB/s data-rate per lane
* Gen3 switched to NRZ with 128/130 encoding.
* PCIe Gen3 = 8 GT/s signal-rate per lane x 128/130 = 1 GB/s data-rate per lane
* PCIe Gen4 = 16 GT/s signal-rate per lane x 128/130 = 2 GB/s data-rate per lane
* PCIe Gen5 = 32 GT/s signal-rate per lane x 128/130 = 4 GB/s data-rate per lane
* Gen6 switched to PAM with 242/256 FLIT (242B payload protected by 8B CRC + 6B FEC).
* PCIe Gen6 = 64 GT/s signal-rate per lane x 242/256 = 8 GB/s data-rate per lane
* PCIe Gen7 = 128GT/s signal-rate per lane x 242/256 = 16 GB/s data-rate per lane
*/
/* lanespeed in Gbit/s */
if (generation <= 2)
lanespeed = 2.5f * generation * 0.8f;
else if (generation <= 5)
lanespeed = 8.0f * (1<<(generation-3)) * 128/130;
else
lanespeed = 8.0f * (1<<(generation-3)) * 242/256; /* assume Gen8 will be 256 GT/s and so on */
/* linkspeed in GB/s */
return lanespeed * lanes / 8;
}
#endif /* HWLOC_PRIVATE_MISC_H */ #endif /* HWLOC_PRIVATE_MISC_H */
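A worked check of the formula above (illustration only, not part of the diff): a Gen4 x16 link.

/* generation 4, 16 lanes:
 *   lanespeed = 8.0f * (1 << (4-3)) * 128/130  ~= 15.75 Gbit/s per lane
 *   linkspeed = 15.75 * 16 / 8                 ~= 31.5  GB/s (single direction)
 */
float gen4_x16 = hwloc__pci_link_speed(4, 16);  /* ~= 31.5f */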


@ -1,6 +1,6 @@
/* /*
* Copyright © 2009 CNRS * Copyright © 2009 CNRS
* Copyright © 2009-2020 Inria. All rights reserved. * Copyright © 2009-2024 Inria. All rights reserved.
* Copyright © 2009-2010, 2012 Université Bordeaux * Copyright © 2009-2010, 2012 Université Bordeaux
* Copyright © 2011-2015 Cisco Systems, Inc. All rights reserved. * Copyright © 2011-2015 Cisco Systems, Inc. All rights reserved.
* See COPYING in top-level directory. * See COPYING in top-level directory.
@ -287,6 +287,7 @@ static __hwloc_inline int hwloc__check_membind_policy(hwloc_membind_policy_t pol
|| policy == HWLOC_MEMBIND_FIRSTTOUCH || policy == HWLOC_MEMBIND_FIRSTTOUCH
|| policy == HWLOC_MEMBIND_BIND || policy == HWLOC_MEMBIND_BIND
|| policy == HWLOC_MEMBIND_INTERLEAVE || policy == HWLOC_MEMBIND_INTERLEAVE
|| policy == HWLOC_MEMBIND_WEIGHTED_INTERLEAVE
|| policy == HWLOC_MEMBIND_NEXTTOUCH) || policy == HWLOC_MEMBIND_NEXTTOUCH)
return 0; return 0;
return -1; return -1;
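A minimal sketch (not part of this diff) of requesting the newly accepted policy from an application, assuming hwloc 2.11+ on Linux 6.9+ and checking the new support bit first:

#include <hwloc.h>
#include <stdio.h>

int main(void)
{
    hwloc_topology_t topology;
    const struct hwloc_topology_support *support;
    hwloc_const_bitmap_t nodes;

    hwloc_topology_init(&topology);
    hwloc_topology_load(topology);

    support = hwloc_topology_get_support(topology);
    nodes = hwloc_topology_get_topology_nodeset(topology);

    if (support->membind->weighted_interleave_membind) {
        /* interleave this process' future allocations across all NUMA nodes,
         * weighted by the kernel-provided per-node weights */
        hwloc_set_membind(topology, nodes, HWLOC_MEMBIND_WEIGHTED_INTERLEAVE,
                          HWLOC_MEMBIND_BYNODESET);
    } else {
        printf("weighted interleave not supported here\n");
    }
    hwloc_topology_destroy(topology);
    return 0;
}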


@ -1,6 +1,6 @@
/* /*
* Copyright © 2009 CNRS * Copyright © 2009 CNRS
* Copyright © 2009-2020 Inria. All rights reserved. * Copyright © 2009-2024 Inria. All rights reserved.
* Copyright © 2009-2011 Université Bordeaux * Copyright © 2009-2011 Université Bordeaux
* Copyright © 2009-2011 Cisco Systems, Inc. All rights reserved. * Copyright © 2009-2011 Cisco Systems, Inc. All rights reserved.
* See COPYING in top-level directory. * See COPYING in top-level directory.
@ -245,6 +245,7 @@ int hwloc_bitmap_copy(struct hwloc_bitmap_s * dst, const struct hwloc_bitmap_s *
/* Strings always use 32bit groups */ /* Strings always use 32bit groups */
#define HWLOC_PRIxSUBBITMAP "%08lx" #define HWLOC_PRIxSUBBITMAP "%08lx"
#define HWLOC_BITMAP_SUBSTRING_SIZE 32 #define HWLOC_BITMAP_SUBSTRING_SIZE 32
#define HWLOC_BITMAP_SUBSTRING_FULL_VALUE 0xFFFFFFFFUL
#define HWLOC_BITMAP_SUBSTRING_LENGTH (HWLOC_BITMAP_SUBSTRING_SIZE/4) #define HWLOC_BITMAP_SUBSTRING_LENGTH (HWLOC_BITMAP_SUBSTRING_SIZE/4)
#define HWLOC_BITMAP_STRING_PER_LONG (HWLOC_BITS_PER_LONG/HWLOC_BITMAP_SUBSTRING_SIZE) #define HWLOC_BITMAP_STRING_PER_LONG (HWLOC_BITS_PER_LONG/HWLOC_BITMAP_SUBSTRING_SIZE)
@ -261,6 +262,7 @@ int hwloc_bitmap_snprintf(char * __hwloc_restrict buf, size_t buflen, const stru
const unsigned long accum_mask = ~0UL; const unsigned long accum_mask = ~0UL;
#else /* HWLOC_BITS_PER_LONG != HWLOC_BITMAP_SUBSTRING_SIZE */ #else /* HWLOC_BITS_PER_LONG != HWLOC_BITMAP_SUBSTRING_SIZE */
const unsigned long accum_mask = ((1UL << HWLOC_BITMAP_SUBSTRING_SIZE) - 1) << (HWLOC_BITS_PER_LONG - HWLOC_BITMAP_SUBSTRING_SIZE); const unsigned long accum_mask = ((1UL << HWLOC_BITMAP_SUBSTRING_SIZE) - 1) << (HWLOC_BITS_PER_LONG - HWLOC_BITMAP_SUBSTRING_SIZE);
int merge_with_infinite_prefix = 0;
#endif /* HWLOC_BITS_PER_LONG != HWLOC_BITMAP_SUBSTRING_SIZE */ #endif /* HWLOC_BITS_PER_LONG != HWLOC_BITMAP_SUBSTRING_SIZE */
HWLOC__BITMAP_CHECK(set); HWLOC__BITMAP_CHECK(set);
@ -279,6 +281,9 @@ int hwloc_bitmap_snprintf(char * __hwloc_restrict buf, size_t buflen, const stru
res = size>0 ? (int)size - 1 : 0; res = size>0 ? (int)size - 1 : 0;
tmp += res; tmp += res;
size -= res; size -= res;
#if HWLOC_BITS_PER_LONG > HWLOC_BITMAP_SUBSTRING_SIZE
merge_with_infinite_prefix = 1;
#endif
} }
i=(int) set->ulongs_count-1; i=(int) set->ulongs_count-1;
@ -294,16 +299,24 @@ int hwloc_bitmap_snprintf(char * __hwloc_restrict buf, size_t buflen, const stru
} }
while (i>=0 || accumed) { while (i>=0 || accumed) {
unsigned long value;
/* Refill accumulator */ /* Refill accumulator */
if (!accumed) { if (!accumed) {
accum = set->ulongs[i--]; accum = set->ulongs[i--];
accumed = HWLOC_BITS_PER_LONG; accumed = HWLOC_BITS_PER_LONG;
} }
value = (accum & accum_mask) >> (HWLOC_BITS_PER_LONG - HWLOC_BITMAP_SUBSTRING_SIZE);
if (accum & accum_mask) { #if HWLOC_BITS_PER_LONG > HWLOC_BITMAP_SUBSTRING_SIZE
if (merge_with_infinite_prefix && value == HWLOC_BITMAP_SUBSTRING_FULL_VALUE) {
/* first full subbitmap merged with infinite prefix */
res = 0;
} else
#endif
if (value) {
/* print the whole subset if not empty */ /* print the whole subset if not empty */
res = hwloc_snprintf(tmp, size, needcomma ? ",0x" HWLOC_PRIxSUBBITMAP : "0x" HWLOC_PRIxSUBBITMAP, res = hwloc_snprintf(tmp, size, needcomma ? ",0x" HWLOC_PRIxSUBBITMAP : "0x" HWLOC_PRIxSUBBITMAP, value);
(accum & accum_mask) >> (HWLOC_BITS_PER_LONG - HWLOC_BITMAP_SUBSTRING_SIZE));
needcomma = 1; needcomma = 1;
} else if (i == -1 && accumed == HWLOC_BITMAP_SUBSTRING_SIZE) { } else if (i == -1 && accumed == HWLOC_BITMAP_SUBSTRING_SIZE) {
/* print a single 0 to mark the last subset */ /* print a single 0 to mark the last subset */
@ -323,6 +336,7 @@ int hwloc_bitmap_snprintf(char * __hwloc_restrict buf, size_t buflen, const stru
#else #else
accum <<= HWLOC_BITMAP_SUBSTRING_SIZE; accum <<= HWLOC_BITMAP_SUBSTRING_SIZE;
accumed -= HWLOC_BITMAP_SUBSTRING_SIZE; accumed -= HWLOC_BITMAP_SUBSTRING_SIZE;
merge_with_infinite_prefix = 0;
#endif #endif
if (res >= size) if (res >= size)
@ -362,7 +376,8 @@ int hwloc_bitmap_sscanf(struct hwloc_bitmap_s *set, const char * __hwloc_restric
{ {
const char * current = string; const char * current = string;
unsigned long accum = 0; unsigned long accum = 0;
int count=0; int count = 0;
int ulongcount;
int infinite = 0; int infinite = 0;
/* count how many substrings there are */ /* count how many substrings there are */
@ -383,9 +398,20 @@ int hwloc_bitmap_sscanf(struct hwloc_bitmap_s *set, const char * __hwloc_restric
count--; count--;
} }
if (hwloc_bitmap_reset_by_ulongs(set, (count + HWLOC_BITMAP_STRING_PER_LONG - 1) / HWLOC_BITMAP_STRING_PER_LONG) < 0) ulongcount = (count + HWLOC_BITMAP_STRING_PER_LONG - 1) / HWLOC_BITMAP_STRING_PER_LONG;
if (hwloc_bitmap_reset_by_ulongs(set, ulongcount) < 0)
return -1; return -1;
set->infinite = 0;
set->infinite = 0; /* will be updated later */
#if HWLOC_BITS_PER_LONG != HWLOC_BITMAP_SUBSTRING_SIZE
if (infinite && (count % HWLOC_BITMAP_STRING_PER_LONG) != 0) {
/* accumulate substrings of the first ulong that are hidden in the infinite prefix */
int i;
for(i = (count % HWLOC_BITMAP_STRING_PER_LONG); i < HWLOC_BITMAP_STRING_PER_LONG; i++)
accum |= (HWLOC_BITMAP_SUBSTRING_FULL_VALUE << (i*HWLOC_BITMAP_SUBSTRING_SIZE));
}
#endif
while (*current != '\0') { while (*current != '\0') {
unsigned long val; unsigned long val;
@ -544,6 +570,9 @@ int hwloc_bitmap_taskset_snprintf(char * __hwloc_restrict buf, size_t buflen, co
ssize_t size = buflen; ssize_t size = buflen;
char *tmp = buf; char *tmp = buf;
int res, ret = 0; int res, ret = 0;
#if HWLOC_BITS_PER_LONG == 64
int merge_with_infinite_prefix = 0;
#endif
int started = 0; int started = 0;
int i; int i;
@ -563,6 +592,9 @@ int hwloc_bitmap_taskset_snprintf(char * __hwloc_restrict buf, size_t buflen, co
res = size>0 ? (int)size - 1 : 0; res = size>0 ? (int)size - 1 : 0;
tmp += res; tmp += res;
size -= res; size -= res;
#if HWLOC_BITS_PER_LONG == 64
merge_with_infinite_prefix = 1;
#endif
} }
i=set->ulongs_count-1; i=set->ulongs_count-1;
@ -582,7 +614,11 @@ int hwloc_bitmap_taskset_snprintf(char * __hwloc_restrict buf, size_t buflen, co
if (started) { if (started) {
/* print the whole subset */ /* print the whole subset */
#if HWLOC_BITS_PER_LONG == 64 #if HWLOC_BITS_PER_LONG == 64
if (merge_with_infinite_prefix && (val & 0xffffffff00000000UL) == 0xffffffff00000000UL) {
res = hwloc_snprintf(tmp, size, "%08lx", val & 0xffffffffUL);
} else {
res = hwloc_snprintf(tmp, size, "%016lx", val); res = hwloc_snprintf(tmp, size, "%016lx", val);
}
#else #else
res = hwloc_snprintf(tmp, size, "%08lx", val); res = hwloc_snprintf(tmp, size, "%08lx", val);
#endif #endif
@ -599,6 +635,9 @@ int hwloc_bitmap_taskset_snprintf(char * __hwloc_restrict buf, size_t buflen, co
res = size>0 ? (int)size - 1 : 0; res = size>0 ? (int)size - 1 : 0;
tmp += res; tmp += res;
size -= res; size -= res;
#if HWLOC_BITS_PER_LONG == 64
merge_with_infinite_prefix = 0;
#endif
} }
/* if didn't display anything, display 0x0 */ /* if didn't display anything, display 0x0 */
@ -679,6 +718,10 @@ int hwloc_bitmap_taskset_sscanf(struct hwloc_bitmap_s *set, const char * __hwloc
goto failed; goto failed;
set->ulongs[count-1] = val; set->ulongs[count-1] = val;
if (infinite && tmpchars != HWLOC_BITS_PER_LONG/4) {
/* infinite prefix with partial substring, fill remaining bits */
set->ulongs[count-1] |= (~0ULL)<<(4*tmpchars);
}
current += tmpchars; current += tmpchars;
chars -= tmpchars; chars -= tmpchars;
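For context, a sketch (not from this diff) of the strings this infinite-prefix handling affects: with the change, full 32-bit substrings adjacent to the infinite prefix are folded into "0xf...f" instead of being printed explicitly.

#include <hwloc.h>
#include <stdio.h>

int main(void)
{
    hwloc_bitmap_t set = hwloc_bitmap_alloc_full();   /* infinitely-set bitmap */
    char buf[64];

    hwloc_bitmap_clr_range(set, 0, 15);               /* clear bits 0-15 */
    hwloc_bitmap_snprintf(buf, sizeof(buf), set);
    printf("%s\n", buf);   /* expected form: 0xf...f,0xffff0000 */

    hwloc_bitmap_free(set);
    return 0;
}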


@ -1,5 +1,5 @@
/* /*
* Copyright © 2020-2022 Inria. All rights reserved. * Copyright © 2020-2024 Inria. All rights reserved.
* See COPYING in top-level directory. * See COPYING in top-level directory.
*/ */
@ -50,6 +50,7 @@ hwloc_internal_cpukinds_dup(hwloc_topology_t new, hwloc_topology_t old)
return -1; return -1;
new->cpukinds = kinds; new->cpukinds = kinds;
new->nr_cpukinds = old->nr_cpukinds; new->nr_cpukinds = old->nr_cpukinds;
new->nr_cpukinds_allocated = old->nr_cpukinds;
memcpy(kinds, old->cpukinds, old->nr_cpukinds * sizeof(*kinds)); memcpy(kinds, old->cpukinds, old->nr_cpukinds * sizeof(*kinds));
for(i=0;i<old->nr_cpukinds; i++) { for(i=0;i<old->nr_cpukinds; i++) {


@ -1,5 +1,5 @@
/* /*
* Copyright © 2010-2022 Inria. All rights reserved. * Copyright © 2010-2024 Inria. All rights reserved.
* Copyright © 2011-2012 Université Bordeaux * Copyright © 2011-2012 Université Bordeaux
* Copyright © 2011 Cisco Systems, Inc. All rights reserved. * Copyright © 2011 Cisco Systems, Inc. All rights reserved.
* See COPYING in top-level directory. * See COPYING in top-level directory.
@ -624,8 +624,8 @@ void * hwloc_distances_add_create(hwloc_topology_t topology,
return NULL; return NULL;
} }
if ((kind & ~HWLOC_DISTANCES_KIND_ALL) if ((kind & ~HWLOC_DISTANCES_KIND_ALL)
|| hwloc_weight_long(kind & HWLOC_DISTANCES_KIND_FROM_ALL) != 1 || hwloc_weight_long(kind & HWLOC_DISTANCES_KIND_FROM_ALL) > 1
|| hwloc_weight_long(kind & HWLOC_DISTANCES_KIND_MEANS_ALL) != 1) { || hwloc_weight_long(kind & HWLOC_DISTANCES_KIND_MEANS_ALL) > 1) {
errno = EINVAL; errno = EINVAL;
return NULL; return NULL;
} }


@ -1,5 +1,5 @@
/* /*
* Copyright © 2020-2023 Inria. All rights reserved. * Copyright © 2020-2024 Inria. All rights reserved.
* See COPYING in top-level directory. * See COPYING in top-level directory.
*/ */
@ -14,13 +14,26 @@
*/ */
static __hwloc_inline static __hwloc_inline
hwloc_uint64_t hwloc__memattr_get_convenience_value(hwloc_memattr_id_t id, int hwloc__memattr_get_convenience_value(hwloc_memattr_id_t id,
hwloc_obj_t node) hwloc_obj_t node,
hwloc_uint64_t *valuep)
{ {
if (id == HWLOC_MEMATTR_ID_CAPACITY) if (id == HWLOC_MEMATTR_ID_CAPACITY) {
return node->attr->numanode.local_memory; if (node->type != HWLOC_OBJ_NUMANODE) {
else if (id == HWLOC_MEMATTR_ID_LOCALITY) errno = EINVAL;
return hwloc_bitmap_weight(node->cpuset); return -1;
}
*valuep = node->attr->numanode.local_memory;
return 0;
}
else if (id == HWLOC_MEMATTR_ID_LOCALITY) {
if (!node->cpuset) {
errno = EINVAL;
return -1;
}
*valuep = hwloc_bitmap_weight(node->cpuset);
return 0;
}
else else
assert(0); assert(0);
return 0; /* shut up the compiler */ return 0; /* shut up the compiler */
@ -622,7 +635,7 @@ hwloc_memattr_get_targets(hwloc_topology_t topology,
if (found<max) { if (found<max) {
targets[found] = node; targets[found] = node;
if (values) if (values)
values[found] = hwloc__memattr_get_convenience_value(id, node); hwloc__memattr_get_convenience_value(id, node, &values[found]);
} }
found++; found++;
} }
@ -748,7 +761,7 @@ hwloc_memattr_get_initiators(hwloc_topology_t topology,
struct hwloc_internal_memattr_target_s *imtg; struct hwloc_internal_memattr_target_s *imtg;
unsigned i, max; unsigned i, max;
if (flags) { if (flags || !target_node) {
errno = EINVAL; errno = EINVAL;
return -1; return -1;
} }
@ -810,7 +823,7 @@ hwloc_memattr_get_value(hwloc_topology_t topology,
struct hwloc_internal_memattr_s *imattr; struct hwloc_internal_memattr_s *imattr;
struct hwloc_internal_memattr_target_s *imtg; struct hwloc_internal_memattr_target_s *imtg;
if (flags) { if (flags || !target_node) {
errno = EINVAL; errno = EINVAL;
return -1; return -1;
} }
@ -823,8 +836,7 @@ hwloc_memattr_get_value(hwloc_topology_t topology,
if (imattr->iflags & HWLOC_IMATTR_FLAG_CONVENIENCE) { if (imattr->iflags & HWLOC_IMATTR_FLAG_CONVENIENCE) {
/* convenience attributes */ /* convenience attributes */
*valuep = hwloc__memattr_get_convenience_value(id, target_node); return hwloc__memattr_get_convenience_value(id, target_node, valuep);
return 0;
} }
/* normal attributes */ /* normal attributes */
@ -936,7 +948,7 @@ hwloc_memattr_set_value(hwloc_topology_t topology,
{ {
struct hwloc_internal_location_s iloc, *ilocp; struct hwloc_internal_location_s iloc, *ilocp;
if (flags) { if (flags || !target_node) {
errno = EINVAL; errno = EINVAL;
return -1; return -1;
} }
@ -1007,10 +1019,10 @@ hwloc_memattr_get_best_target(hwloc_topology_t topology,
/* convenience attributes */ /* convenience attributes */
for(j=0; ; j++) { for(j=0; ; j++) {
hwloc_obj_t node = hwloc_get_obj_by_type(topology, HWLOC_OBJ_NUMANODE, j); hwloc_obj_t node = hwloc_get_obj_by_type(topology, HWLOC_OBJ_NUMANODE, j);
hwloc_uint64_t value; hwloc_uint64_t value = 0;
if (!node) if (!node)
break; break;
value = hwloc__memattr_get_convenience_value(id, node); hwloc__memattr_get_convenience_value(id, node, &value);
hwloc__update_best_target(&best, &best_value, &found, hwloc__update_best_target(&best, &best_value, &found,
node, value, node, value,
imattr->flags & HWLOC_MEMATTR_FLAG_HIGHER_FIRST); imattr->flags & HWLOC_MEMATTR_FLAG_HIGHER_FIRST);
@ -1093,7 +1105,7 @@ hwloc_memattr_get_best_initiator(hwloc_topology_t topology,
int found; int found;
unsigned i; unsigned i;
if (flags) { if (flags || !target_node) {
errno = EINVAL; errno = EINVAL;
return -1; return -1;
} }
@ -1806,6 +1818,12 @@ hwloc__apply_memory_tiers_subtypes(hwloc_topology_t topology,
} }
} }
} }
if (nr_tiers > 1) {
hwloc_obj_t root = hwloc_get_root_obj(topology);
char tmp[20];
snprintf(tmp, sizeof(tmp), "%u", nr_tiers);
hwloc__add_info_nodup(&root->infos, &root->infos_count, "MemoryTiersNr", tmp, 1);
}
} }
int int


@ -1,5 +1,5 @@
/* /*
* Copyright © 2009-2022 Inria. All rights reserved. * Copyright © 2009-2024 Inria. All rights reserved.
* See COPYING in top-level directory. * See COPYING in top-level directory.
*/ */
@ -886,36 +886,12 @@ hwloc_pcidisc_find_linkspeed(const unsigned char *config,
unsigned offset, float *linkspeed) unsigned offset, float *linkspeed)
{ {
unsigned linksta, speed, width; unsigned linksta, speed, width;
float lanespeed;
memcpy(&linksta, &config[offset + HWLOC_PCI_EXP_LNKSTA], 4); memcpy(&linksta, &config[offset + HWLOC_PCI_EXP_LNKSTA], 4);
speed = linksta & HWLOC_PCI_EXP_LNKSTA_SPEED; /* PCIe generation */ speed = linksta & HWLOC_PCI_EXP_LNKSTA_SPEED; /* PCIe generation */
width = (linksta & HWLOC_PCI_EXP_LNKSTA_WIDTH) >> 4; /* how many lanes */ width = (linksta & HWLOC_PCI_EXP_LNKSTA_WIDTH) >> 4; /* how many lanes */
/*
* These are single-direction bandwidths only.
*
* Gen1 used NRZ with 8/10 encoding.
* PCIe Gen1 = 2.5GT/s signal-rate per lane x 8/10 = 0.25GB/s data-rate per lane
* PCIe Gen2 = 5 GT/s signal-rate per lane x 8/10 = 0.5 GB/s data-rate per lane
* Gen3 switched to NRZ with 128/130 encoding.
* PCIe Gen3 = 8 GT/s signal-rate per lane x 128/130 = 1 GB/s data-rate per lane
* PCIe Gen4 = 16 GT/s signal-rate per lane x 128/130 = 2 GB/s data-rate per lane
* PCIe Gen5 = 32 GT/s signal-rate per lane x 128/130 = 4 GB/s data-rate per lane
* Gen6 switched to PAM with with 242/256 FLIT (242B payload protected by 8B CRC + 6B FEC).
* PCIe Gen6 = 64 GT/s signal-rate per lane x 242/256 = 8 GB/s data-rate per lane
* PCIe Gen7 = 128GT/s signal-rate per lane x 242/256 = 16 GB/s data-rate per lane
*/
/* lanespeed in Gbit/s */ *linkspeed = hwloc__pci_link_speed(speed, width);
if (speed <= 2)
lanespeed = 2.5f * speed * 0.8f;
else if (speed <= 5)
lanespeed = 8.0f * (1<<(speed-3)) * 128/130;
else
lanespeed = 8.0f * (1<<(speed-3)) * 242/256; /* assume Gen8 will be 256 GT/s and so on */
/* linkspeed in GB/s */
*linkspeed = lanespeed * width / 8;
return 0; return 0;
} }


@ -1,6 +1,6 @@
/* /*
* Copyright © 2009 CNRS * Copyright © 2009 CNRS
* Copyright © 2009-2023 Inria. All rights reserved. * Copyright © 2009-2024 Inria. All rights reserved.
* Copyright © 2009-2012, 2020 Université Bordeaux * Copyright © 2009-2012, 2020 Université Bordeaux
* Copyright © 2011 Cisco Systems, Inc. All rights reserved. * Copyright © 2011 Cisco Systems, Inc. All rights reserved.
* See COPYING in top-level directory. * See COPYING in top-level directory.
@ -220,7 +220,7 @@ static void hwloc_win_get_function_ptrs(void)
#pragma GCC diagnostic ignored "-Wcast-function-type" #pragma GCC diagnostic ignored "-Wcast-function-type"
#endif #endif
kernel32 = LoadLibrary("kernel32.dll"); kernel32 = LoadLibrary(TEXT("kernel32.dll"));
if (kernel32) { if (kernel32) {
GetActiveProcessorGroupCountProc = GetActiveProcessorGroupCountProc =
(PFN_GETACTIVEPROCESSORGROUPCOUNT) GetProcAddress(kernel32, "GetActiveProcessorGroupCount"); (PFN_GETACTIVEPROCESSORGROUPCOUNT) GetProcAddress(kernel32, "GetActiveProcessorGroupCount");
@ -249,12 +249,12 @@ static void hwloc_win_get_function_ptrs(void)
} }
if (!QueryWorkingSetExProc) { if (!QueryWorkingSetExProc) {
HMODULE psapi = LoadLibrary("psapi.dll"); HMODULE psapi = LoadLibrary(TEXT("psapi.dll"));
if (psapi) if (psapi)
QueryWorkingSetExProc = (PFN_QUERYWORKINGSETEX) GetProcAddress(psapi, "QueryWorkingSetEx"); QueryWorkingSetExProc = (PFN_QUERYWORKINGSETEX) GetProcAddress(psapi, "QueryWorkingSetEx");
} }
ntdll = GetModuleHandle("ntdll"); ntdll = GetModuleHandle(TEXT("ntdll"));
RtlGetVersionProc = (PFN_RTLGETVERSION) GetProcAddress(ntdll, "RtlGetVersion"); RtlGetVersionProc = (PFN_RTLGETVERSION) GetProcAddress(ntdll, "RtlGetVersion");
#if HWLOC_HAVE_GCC_W_CAST_FUNCTION_TYPE #if HWLOC_HAVE_GCC_W_CAST_FUNCTION_TYPE
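A short illustration (not from this diff) of why the TEXT() wrapping matters: when UNICODE is defined, the Win32 LoadLibrary macro resolves to LoadLibraryW and expects a wide-character literal, which TEXT() provides in either build.

#include <windows.h>

static HMODULE load_kernel32(void)
{
    /* expands to LoadLibraryW(L"kernel32.dll") under UNICODE,
     * LoadLibraryA("kernel32.dll") otherwise */
    return LoadLibrary(TEXT("kernel32.dll"));
}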


@ -1,11 +1,11 @@
/* /*
* Copyright © 2010-2023 Inria. All rights reserved. * Copyright © 2010-2024 Inria. All rights reserved.
* Copyright © 2010-2013 Université Bordeaux * Copyright © 2010-2013 Université Bordeaux
* Copyright © 2010-2011 Cisco Systems, Inc. All rights reserved. * Copyright © 2010-2011 Cisco Systems, Inc. All rights reserved.
* See COPYING in top-level directory. * See COPYING in top-level directory.
* *
* *
* This backend is only used when the operating system does not export * This backend is mostly used when the operating system does not export
* the necessary hardware topology information to user-space applications. * the necessary hardware topology information to user-space applications.
* Currently, FreeBSD and NetBSD only add PUs and then fallback to this * Currently, FreeBSD and NetBSD only add PUs and then fallback to this
* backend for CPU/Cache discovery. * backend for CPU/Cache discovery.
@@ -15,6 +15,7 @@
* on various architectures, without having to use this x86-specific code. * on various architectures, without having to use this x86-specific code.
* But this backend is still used after them to annotate some objects with * But this backend is still used after them to annotate some objects with
* additional details (CPU info in Package, Inclusiveness in Caches). * additional details (CPU info in Package, Inclusiveness in Caches).
* It may also be enabled manually to work around bugs in native OS discovery.
*/ */
#include "private/autogen/config.h" #include "private/autogen/config.h"
@@ -487,7 +488,7 @@ static void read_amd_cores_legacy(struct procinfo *infos, struct cpuiddump *src_
} }
/* AMD unit/node from CPUID 0x8000001e leaf (topoext) */ /* AMD unit/node from CPUID 0x8000001e leaf (topoext) */
static void read_amd_cores_topoext(struct hwloc_x86_backend_data_s *data, struct procinfo *infos, unsigned long flags, struct cpuiddump *src_cpuiddump) static void read_amd_cores_topoext(struct hwloc_x86_backend_data_s *data, struct procinfo *infos, unsigned long flags __hwloc_attribute_unused, struct cpuiddump *src_cpuiddump)
{ {
unsigned apic_id, nodes_per_proc = 0; unsigned apic_id, nodes_per_proc = 0;
unsigned eax, ebx, ecx, edx; unsigned eax, ebx, ecx, edx;
@@ -496,7 +497,6 @@ static void read_amd_cores_topoext(struct hwloc_x86_backend_data_s *data, struct
cpuid_or_from_dump(&eax, &ebx, &ecx, &edx, src_cpuiddump); cpuid_or_from_dump(&eax, &ebx, &ecx, &edx, src_cpuiddump);
infos->apicid = apic_id = eax; infos->apicid = apic_id = eax;
if (flags & HWLOC_X86_DISC_FLAG_TOPOEXT_NUMANODES) {
if (infos->cpufamilynumber == 0x16) { if (infos->cpufamilynumber == 0x16) {
/* ecx is reserved */ /* ecx is reserved */
infos->ids[NODE] = 0; infos->ids[NODE] = 0;
@@ -511,7 +511,6 @@ static void read_amd_cores_topoext(struct hwloc_x86_backend_data_s *data, struct
|| (infos->cpufamilynumber == 0x19 && nodes_per_proc > 1)) { || (infos->cpufamilynumber == 0x19 && nodes_per_proc > 1)) {
hwloc_debug("warning: undefined nodes_per_proc value %u, assuming it means %u\n", nodes_per_proc, nodes_per_proc); hwloc_debug("warning: undefined nodes_per_proc value %u, assuming it means %u\n", nodes_per_proc, nodes_per_proc);
} }
}
if (infos->cpufamilynumber <= 0x16) { /* topoext appeared in 0x15 and compute-units were only used in 0x15 and 0x16 */ if (infos->cpufamilynumber <= 0x16) { /* topoext appeared in 0x15 and compute-units were only used in 0x15 and 0x16 */
unsigned cores_per_unit; unsigned cores_per_unit;
@@ -533,9 +532,9 @@ static void read_amd_cores_topoext(struct hwloc_x86_backend_data_s *data, struct
} }
/* Intel core/thread or even die/module/tile from CPUID 0x0b or 0x1f leaves (v1 and v2 extended topology enumeration) /* Intel core/thread or even die/module/tile from CPUID 0x0b or 0x1f leaves (v1 and v2 extended topology enumeration)
* or AMD complex/ccd from CPUID 0x80000026 (extended CPU topology) * or AMD core/thread or even complex/ccd from CPUID 0x0b or 0x80000026 (extended CPU topology)
*/ */
static void read_extended_topo(struct hwloc_x86_backend_data_s *data, struct procinfo *infos, unsigned leaf, enum cpuid_type cpuid_type, struct cpuiddump *src_cpuiddump) static void read_extended_topo(struct hwloc_x86_backend_data_s *data, struct procinfo *infos, unsigned leaf, enum cpuid_type cpuid_type __hwloc_attribute_unused, struct cpuiddump *src_cpuiddump)
{ {
unsigned level, apic_nextshift, apic_type, apic_id = 0, apic_shift = 0, id; unsigned level, apic_nextshift, apic_type, apic_id = 0, apic_shift = 0, id;
unsigned threadid __hwloc_attribute_unused = 0; /* shut-up compiler */ unsigned threadid __hwloc_attribute_unused = 0; /* shut-up compiler */
@@ -547,20 +546,15 @@ static void read_extended_topo(struct hwloc_x86_backend_data_s *data, struct pro
eax = leaf; eax = leaf;
cpuid_or_from_dump(&eax, &ebx, &ecx, &edx, src_cpuiddump); cpuid_or_from_dump(&eax, &ebx, &ecx, &edx, src_cpuiddump);
/* Intel specifies that the 0x0b/0x1f loop should stop when we get "invalid domain" (0 in ecx[8:15]) /* Intel specifies that the 0x0b/0x1f loop should stop when we get "invalid domain" (0 in ecx[8:15])
* (if so, we also get 0 in eax/ebx for invalid subleaves). * (if so, we also get 0 in eax/ebx for invalid subleaves). Zhaoxin implements this too.
* However AMD rather says that the 0x80000026/0x0b loop should stop when we get "no thread at this level" (0 in ebx[0:15]). * However AMD rather says that the 0x80000026/0x0b loop should stop when we get "no thread at this level" (0 in ebx[0:15]).
* Zhaoxin follows the Intel specs but also returns "no thread at this level" for the last *valid* level (at least on KH-4000). *
* From the Linux kernel code, it's very likely that AMD also returns "invalid domain" * Linux kernel <= 6.8 used "invalid domain" for both Intel and AMD (in detect_extended_topology())
* (because detect_extended_topology() uses that for all x86 CPUs) * but x86 discovery revamp in 6.9 now properly checks both Intel and AMD conditions (in topo_subleaf()).
* but keep with the official doc until AMD can clarify that (see #593). * So let's assume we are allowed to break out once one of the Intel+AMD conditions is met.
*/ */
if (cpuid_type == amd) { if (!(ebx & 0xffff) || !(ecx & 0xff00))
if (!(ebx & 0xffff))
break; break;
} else {
if (!(ecx & 0xff00))
break;
}
apic_packageshift = eax & 0x1f; apic_packageshift = eax & 0x1f;
} }
@@ -572,13 +566,8 @@ static void read_extended_topo(struct hwloc_x86_backend_data_s *data, struct pro
ecx = level; ecx = level;
eax = leaf; eax = leaf;
cpuid_or_from_dump(&eax, &ebx, &ecx, &edx, src_cpuiddump); cpuid_or_from_dump(&eax, &ebx, &ecx, &edx, src_cpuiddump);
if (cpuid_type == amd) { if (!(ebx & 0xffff) || !(ecx & 0xff00))
if (!(ebx & 0xffff))
break; break;
} else {
if (!(ecx & 0xff00))
break;
}
apic_nextshift = eax & 0x1f; apic_nextshift = eax & 0x1f;
apic_type = (ecx & 0xff00) >> 8; apic_type = (ecx & 0xff00) >> 8;
apic_id = edx; apic_id = edx;
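For reference, the bit fields extracted above follow the documented layout of the extended topology subleaves (CPUID 0x0b/0x1f on Intel, 0x0b/0x80000026 on AMD). Below is a hedged sketch of decoding one subleaf, with the combined Intel+AMD validity test used by the new code; the struct and function names are illustrative only:

/* Sketch of decoding one extended-topology subleaf as read above.
 * Documented layout: EAX[4:0] = APIC-ID shift for the next level,
 * EBX[15:0] = logical processors at this level (0 means "no thread at
 * this level" on AMD), ECX[15:8] = level type (0 means "invalid domain"
 * on Intel), EDX = extended APIC ID. */
struct topo_subleaf {
  unsigned nextshift;   /* eax & 0x1f */
  unsigned nr_threads;  /* ebx & 0xffff */
  unsigned level_type;  /* (ecx >> 8) & 0xff */
  unsigned apicid;      /* edx */
  int valid;            /* neither the Intel nor the AMD stop condition hit */
};

static struct topo_subleaf decode_subleaf(unsigned eax, unsigned ebx, unsigned ecx, unsigned edx)
{
  struct topo_subleaf s;
  s.nextshift  = eax & 0x1f;
  s.nr_threads = ebx & 0xffff;
  s.level_type = (ecx >> 8) & 0xff;
  s.apicid     = edx;
  s.valid      = (s.nr_threads != 0) && (s.level_type != 0);
  return s;
}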
@@ -1825,7 +1814,7 @@ hwloc_x86_check_cpuiddump_input(const char *src_cpuiddump_path, hwloc_bitmap_t s
goto out_with_path; goto out_with_path;
} }
fclose(file); fclose(file);
if (strcmp(line, "Architecture: x86\n")) { if (strncmp(line, "Architecture: x86", 17)) {
fprintf(stderr, "hwloc/x86: Found non-x86 dumped cpuid summary in %s: %s\n", path, line); fprintf(stderr, "hwloc/x86: Found non-x86 dumped cpuid summary in %s: %s\n", path, line);
goto out_with_path; goto out_with_path;
} }
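Switching from strcmp() to strncmp() is what lets CPUID dumps exported on Windows (where lines end in "\r\n") be imported elsewhere: only the first 17 bytes of the header are compared. A tiny standalone illustration (the sample strings are made up):

#include <stdio.h>
#include <string.h>

int main(void)
{
  /* A header line as it would be read from a Windows-exported dump (CRLF). */
  const char *line = "Architecture: x86\r\n";

  /* Exact comparison fails because of the trailing '\r'... */
  printf("strcmp:  %s\n", strcmp(line, "Architecture: x86\n") ? "mismatch" : "match");
  /* ...while comparing only the first 17 bytes accepts both \n and \r\n. */
  printf("strncmp: %s\n", strncmp(line, "Architecture: x86", 17) ? "mismatch" : "match");
  return 0;
}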

View file

@@ -1,6 +1,6 @@
/* /*
* Copyright © 2009 CNRS * Copyright © 2009 CNRS
* Copyright © 2009-2020 Inria. All rights reserved. * Copyright © 2009-2024 Inria. All rights reserved.
* Copyright © 2009-2011 Université Bordeaux * Copyright © 2009-2011 Université Bordeaux
* Copyright © 2009-2011 Cisco Systems, Inc. All rights reserved. * Copyright © 2009-2011 Cisco Systems, Inc. All rights reserved.
* See COPYING in top-level directory. * See COPYING in top-level directory.
@@ -41,7 +41,7 @@ typedef struct hwloc__nolibxml_import_state_data_s {
static char * static char *
hwloc__nolibxml_import_ignore_spaces(char *buffer) hwloc__nolibxml_import_ignore_spaces(char *buffer)
{ {
return buffer + strspn(buffer, " \t\n"); return buffer + strspn(buffer, " \t\n\r");
} }
static int static int

View file

@@ -1,6 +1,6 @@
/* /*
* Copyright © 2009 CNRS * Copyright © 2009 CNRS
* Copyright © 2009-2023 Inria. All rights reserved. * Copyright © 2009-2024 Inria. All rights reserved.
* Copyright © 2009-2011, 2020 Université Bordeaux * Copyright © 2009-2011, 2020 Université Bordeaux
* Copyright © 2009-2018 Cisco Systems, Inc. All rights reserved. * Copyright © 2009-2018 Cisco Systems, Inc. All rights reserved.
* See COPYING in top-level directory. * See COPYING in top-level directory.
@@ -872,6 +872,10 @@ hwloc__xml_import_object(hwloc_topology_t topology,
/* deal with possible future type */ /* deal with possible future type */
obj->type = HWLOC_OBJ_GROUP; obj->type = HWLOC_OBJ_GROUP;
obj->attr->group.kind = HWLOC_GROUP_KIND_INTEL_MODULE; obj->attr->group.kind = HWLOC_GROUP_KIND_INTEL_MODULE;
} else if (!strcasecmp(attrvalue, "Cluster")) {
/* deal with possible future type */
obj->type = HWLOC_OBJ_GROUP;
obj->attr->group.kind = HWLOC_GROUP_KIND_LINUX_CLUSTER;
} else if (!strcasecmp(attrvalue, "MemCache")) { } else if (!strcasecmp(attrvalue, "MemCache")) {
/* ignore possible future type */ /* ignore possible future type */
obj->type = _HWLOC_OBJ_FUTURE; obj->type = _HWLOC_OBJ_FUTURE;
@@ -1344,7 +1348,7 @@ hwloc__xml_v2import_support(hwloc_topology_t topology,
HWLOC_BUILD_ASSERT(sizeof(struct hwloc_topology_support) == 4*sizeof(void*)); HWLOC_BUILD_ASSERT(sizeof(struct hwloc_topology_support) == 4*sizeof(void*));
HWLOC_BUILD_ASSERT(sizeof(struct hwloc_topology_discovery_support) == 6); HWLOC_BUILD_ASSERT(sizeof(struct hwloc_topology_discovery_support) == 6);
HWLOC_BUILD_ASSERT(sizeof(struct hwloc_topology_cpubind_support) == 11); HWLOC_BUILD_ASSERT(sizeof(struct hwloc_topology_cpubind_support) == 11);
HWLOC_BUILD_ASSERT(sizeof(struct hwloc_topology_membind_support) == 15); HWLOC_BUILD_ASSERT(sizeof(struct hwloc_topology_membind_support) == 16);
HWLOC_BUILD_ASSERT(sizeof(struct hwloc_topology_misc_support) == 1); HWLOC_BUILD_ASSERT(sizeof(struct hwloc_topology_misc_support) == 1);
#endif #endif
@@ -1378,6 +1382,7 @@ hwloc__xml_v2import_support(hwloc_topology_t topology,
else DO(membind,firsttouch_membind); else DO(membind,firsttouch_membind);
else DO(membind,bind_membind); else DO(membind,bind_membind);
else DO(membind,interleave_membind); else DO(membind,interleave_membind);
else DO(membind,weighted_interleave_membind);
else DO(membind,nexttouch_membind); else DO(membind,nexttouch_membind);
else DO(membind,migrate_membind); else DO(membind,migrate_membind);
else DO(membind,get_area_memlocation); else DO(membind,get_area_memlocation);
@@ -1436,6 +1441,10 @@ hwloc__xml_v2import_distances(hwloc_topology_t topology,
} }
else if (!strcmp(attrname, "kind")) { else if (!strcmp(attrname, "kind")) {
kind = strtoul(attrvalue, NULL, 10); kind = strtoul(attrvalue, NULL, 10);
/* forward compat with "HOPS" kind in v3 */
if (kind & (1UL<<5))
/* hops becomes latency */
kind = (kind & ~(1UL<<5)) | HWLOC_DISTANCES_KIND_MEANS_LATENCY;
} }
else if (!strcmp(attrname, "name")) { else if (!strcmp(attrname, "name")) {
name = attrvalue; name = attrvalue;
@@ -3087,7 +3096,7 @@ hwloc__xml_v2export_support(hwloc__xml_export_state_t parentstate, hwloc_topolog
HWLOC_BUILD_ASSERT(sizeof(struct hwloc_topology_support) == 4*sizeof(void*)); HWLOC_BUILD_ASSERT(sizeof(struct hwloc_topology_support) == 4*sizeof(void*));
HWLOC_BUILD_ASSERT(sizeof(struct hwloc_topology_discovery_support) == 6); HWLOC_BUILD_ASSERT(sizeof(struct hwloc_topology_discovery_support) == 6);
HWLOC_BUILD_ASSERT(sizeof(struct hwloc_topology_cpubind_support) == 11); HWLOC_BUILD_ASSERT(sizeof(struct hwloc_topology_cpubind_support) == 11);
HWLOC_BUILD_ASSERT(sizeof(struct hwloc_topology_membind_support) == 15); HWLOC_BUILD_ASSERT(sizeof(struct hwloc_topology_membind_support) == 16);
HWLOC_BUILD_ASSERT(sizeof(struct hwloc_topology_misc_support) == 1); HWLOC_BUILD_ASSERT(sizeof(struct hwloc_topology_misc_support) == 1);
#endif #endif
@@ -3132,6 +3141,7 @@ hwloc__xml_v2export_support(hwloc__xml_export_state_t parentstate, hwloc_topolog
DO(membind,firsttouch_membind); DO(membind,firsttouch_membind);
DO(membind,bind_membind); DO(membind,bind_membind);
DO(membind,interleave_membind); DO(membind,interleave_membind);
DO(membind,weighted_interleave_membind);
DO(membind,nexttouch_membind); DO(membind,nexttouch_membind);
DO(membind,migrate_membind); DO(membind,migrate_membind);
DO(membind,get_area_memlocation); DO(membind,get_area_memlocation);
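Applications can query the new support bit through the regular support structure before requesting HWLOC_MEMBIND_WEIGHTED_INTERLEAVE. A short sketch, assuming hwloc 2.11+ headers are installed:

#include <hwloc.h>
#include <stdio.h>

int main(void)
{
  hwloc_topology_t topology;
  const struct hwloc_topology_support *support;

  hwloc_topology_init(&topology);
  hwloc_topology_load(topology);

  /* New in 2.11: tells whether HWLOC_MEMBIND_WEIGHTED_INTERLEAVE may be used. */
  support = hwloc_topology_get_support(topology);
  printf("weighted interleave membind %ssupported\n",
         support->membind->weighted_interleave_membind ? "" : "not ");

  hwloc_topology_destroy(topology);
  return 0;
}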

View file

@@ -465,6 +465,20 @@ hwloc_debug_print_objects(int indent __hwloc_attribute_unused, hwloc_obj_t obj)
#define hwloc_debug_print_objects(indent, obj) do { /* nothing */ } while (0) #define hwloc_debug_print_objects(indent, obj) do { /* nothing */ } while (0)
#endif /* !HWLOC_DEBUG */ #endif /* !HWLOC_DEBUG */
int hwloc_obj_set_subtype(hwloc_topology_t topology __hwloc_attribute_unused, hwloc_obj_t obj, const char *subtype)
{
char *new = NULL;
if (subtype) {
new = strdup(subtype);
if (!new)
return -1;
}
if (obj->subtype)
free(obj->subtype);
obj->subtype = new;
return 0;
}
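The new setter duplicates the string internally and frees any previous subtype, so callers keep ownership of their own buffer. A brief usage sketch; tag_first_core and the "MyCoreKind" string are illustrative only:

#include <hwloc.h>

/* Tag the first core with a custom subtype; hwloc_obj_set_subtype() copies
 * the string and releases any previous subtype (passing NULL clears it). */
static int tag_first_core(hwloc_topology_t topology)
{
  hwloc_obj_t core = hwloc_get_obj_by_type(topology, HWLOC_OBJ_CORE, 0);
  if (!core)
    return -1;
  return hwloc_obj_set_subtype(topology, core, "MyCoreKind");
}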
void hwloc__free_infos(struct hwloc_info_s *infos, unsigned count) void hwloc__free_infos(struct hwloc_info_s *infos, unsigned count)
{ {
unsigned i; unsigned i;