Histogram bin counts
Syntax
[N,edges] = histcounts(X)
[N,edges] = histcounts(X,nbins)
[N,edges] = histcounts(X,edges)
[N,edges,bin] = histcounts(___)
N = histcounts(C)
N = histcounts(C,Categories)
[N,Categories] = histcounts(___)
[___] = histcounts(___,Name,Value)
Description
[N,edges] = histcounts(X) partitions the X values into bins and returns the bin counts and the bin edges. The histcounts function uses an automatic binning algorithm that returns uniform bins chosen to cover the range of elements in X and reveal the underlying shape of the distribution.
[N,edges] = histcounts(X,nbins) uses a number of bins specified by the scalar nbins.
[N,edges] = histcounts(X,edges) sorts X into bins with the bin edges specified by the vector edges.
[N,edges,bin] = histcounts(___) also returns an index array, bin, using any of the previous syntaxes. bin is an array of the same size as X whose elements are the bin indices for the corresponding elements in X. The number of elements in the kth bin is nnz(bin==k), which is the same as N(k).
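The relationship between bin and N described above can be checked with a small sketch. The data and edge values are made up for illustration, and accumarray is only one possible way to rebuild the counts:

```matlab
% Illustrative data and edges (assumptions, not taken from this page).
X = [1 2 2 3 5 8 13];
edges = [0 2 4 8 16];

[N,~,bin] = histcounts(X,edges);

% Rebuild the counts from the bin indices. Every element of X lies inside
% the edges here, so bin contains no zeros and accumarray can be used.
Nrebuilt = accumarray(bin(:),1,[numel(N) 1]).';

% nnz(bin==k) matches N(k); check it for the second bin.
k = 2;
countK = nnz(bin == k);
```

If the data can contain NaN values or values outside the edges, bin contains zeros, and those entries would need to be filtered out before calling accumarray.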
N = histcounts(C), where C is a categorical array, returns a vector, N, that indicates the number of elements in C whose value is equal to each of C's categories. N has one element for each category in C.
N = histcounts(C,Categories) counts only the elements in C whose value is equal to the subset of categories specified by Categories.
[N,Categories] = histcounts(___) also returns the categories that correspond to each count in N using either of the previous syntaxes for categorical arrays.
[___] = histcounts(___,Name,Value) specifies additional parameters using one or more name-value arguments. For example, you can specify BinWidth as a scalar to adjust the width of the bins for numeric data.
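As a rough sketch of the BinWidth argument, the following code requests bins of width 5. The data is made up for illustration, and pairing BinWidth with BinLimits (given as a two-element vector) is a choice made here to keep the edges predictable:

```matlab
% Illustrative data (an assumption for this sketch).
X = 1:10;

% Request bins of width 5 that span the limits [0 10].
[N,edges] = histcounts(X,'BinWidth',5,'BinLimits',[0 10]);

% Each bin should have the requested width, and all values are counted
% because every element of X lies inside the bin limits.
widths = diff(edges);
total = sum(N);
```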
Examples
Bin Counts and Bin Edges
Distribute 100 random values into bins. histcounts automatically chooses an appropriate bin width to reveal the underlying distribution of the data.
X = randn(100,1);
[N,edges] = histcounts(X)
N = 1×7
     2    17    28    32    16     3     2
edges = 1×8
    -3    -2    -1     0     1     2     3     4
Specify Number of Bins
Distribute 10 numbers into 6 equally spaced bins.
X = [2 3 5 7 11 13 17 19 23 29];
[N,edges] = histcounts(X,6)
N = 1×6
     2     2     2     2     1     1
edges = 1×7
         0    4.9000    9.8000   14.7000   19.6000   24.5000   29.4000
Specify Bin Edges
Distribute 1,000 random numbers into bins. Define the bin edges with a vector, where the first element is the left edge of the first bin, and the last element is the right edge of the last bin.
X = randn(1000,1);
edges = [-5 -4 -2 -1 -0.5 0 0.5 1 2 4 5];
N = histcounts(X,edges)
N = 1×10
     0    24   149   142   195   200   154   111    25     0
Normalized Bin Counts
Distribute all of the prime numbers less than 100 into bins. Specify 'Normalization' as 'probability' to normalize the bin counts so that sum(N) is 1. That is, each bin count represents the probability that an observation falls within that bin.
X = primes(100);
[N,edges] = histcounts(X,'Normalization','probability')
N = 1×4
    0.4000    0.2800    0.2800    0.0400
edges = 1×5
     0    30    60    90   120
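To make the normalization concrete, this sketch compares the raw counts with the 'probability' output for the same data. It assumes, as is the case here, that the data has no NaN values and that every value falls inside the bins:

```matlab
X = primes(100);

Ncount = histcounts(X);                                % raw counts
Nprob  = histcounts(X,'Normalization','probability');  % normalized counts

% Under the assumptions above, each probability is the raw count divided
% by the number of elements in X.
maxDiff = max(abs(Nprob - Ncount/numel(X)));
```

maxDiff should be zero up to floating-point round-off.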
Determine Bin Placement
Distribute 100 random integers between -5 and 5 into bins, and specify 'BinMethod' as 'integers' to use unit-width bins centered on integers. Specify a third output for histcounts to return a vector representing the bin indices of the data.
X = randi([-5,5],100,1);
[N,edges,bin] = histcounts(X,'BinMethod','integers');
Find the bin count for the third bin by counting the occurrences of the number 3 in the bin index vector, bin. The result is the same as N(3).
count = nnz(bin==3)
count = 8
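Because the example above uses random data, its output changes from run to run. A deterministic sketch of the 'integers' bin method, using a small made-up vector, shows the unit-width bins centered on each integer:

```matlab
% Illustrative data (an assumption for this sketch).
X = [1 1 2 3 3 3];

[N,edges] = histcounts(X,'BinMethod','integers');

% The edges fall halfway between consecutive integers, so each bin is
% centered on one integer value.
centers = edges(1:end-1) + diff(edges)/2;
```

Here centers lists the integers 1, 2, and 3, and N holds how often each one occurs in X.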
Categorical Bin Counts
Create a categorical vector that represents votes. The categories in the vector are 'yes', 'no', or 'undecided'.
A = [0 0 1 1 1 0 0 0 0 NaN NaN 1 0 0 0 1 0 1 0 1 0 0 0 1 1 1 1];
C = categorical(A,[1 0 NaN],{'yes','no','undecided'})
C = 1x27 categorical
    no no yes yes yes no no no no undecided undecided yes no no no yes no yes no yes no no no yes yes yes yes
Determine the number of elements that fall into each category.
[N,Categories] = histcounts(C)
N = 1×3
    11    14     2
Categories = 1x3 cell
    {'yes'}    {'no'}    {'undecided'}
Input Arguments
X
— Data to distribute among bins
vector | matrix | multidimensional array
Data to distribute among bins, specified as a vector, matrix, or multidimensional array. If X is not a vector, then histcounts treats it as a single column vector, X(:).
histcounts ignores all NaN values. Similarly, histcounts ignores Inf and -Inf values unless the bin edges explicitly specify Inf or -Inf as a bin edge.
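A short sketch of the NaN and Inf handling described above, using made-up data and edges:

```matlab
% Illustrative data containing NaN, Inf, and -Inf (an assumption).
X = [1 2 NaN Inf -Inf 3];

% Finite edges: NaN, Inf, and -Inf are not counted, and their entries in
% the bin index output are 0.
[N1,~,bin1] = histcounts(X,[0 2 4]);

% Edges that include -Inf and Inf: the infinite values are now counted,
% but NaN is still ignored.
[N2,~,bin2] = histcounts(X,[-Inf 0 2 4 Inf]);
```

With these inputs, N1 counts only the three finite values, while N2 counts five of the six elements.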
Data Types: single | double | int8 | int16 | int32 | int64 | uint8 | uint16 | uint32 | uint64 | logical | datetime | duration
C
— Categorical data
categorical array
Categorical data, specified as a categorical array. histcounts ignores undefined categorical values.
Data Types: categorical
nbins
— Number of bins
positive integer
Number of bins, specified as a positive integer. If you do not specify nbins, then histcounts automatically calculates how many bins to use based on the values in X.
Example: [N,edges] = histcounts(X,15) uses 15 bins.
edges
— Bin edges
vector
Bin edges, specified as a vector. The first vector element specifies the leading edge of the first bin. The last element specifies the trailing edge of the last bin. Each bin includes its leading edge but not its trailing edge, except for the last bin, which includes both edges.
For datetime and duration data, edges must be a datetime or duration vector in monotonically increasing order.
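The edge-inclusion rule for numeric edges can be seen with a minimal sketch in which every data value sits exactly on an edge (the values are chosen for illustration):

```matlab
% Each value of X coincides with one of the bin edges.
X = [1 2 3];
edges = [1 2 3];

% First bin covers 1 <= x < 2, so it holds only the value 1.
% Last bin covers 2 <= x <= 3, so it holds both 2 and 3.
N = histcounts(X,edges);
```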
Categories
— Categories included in count
all categories (default) | string vector | cell vector of character vectors | pattern scalar | categorical vector
Categories included in count, specified as a string vector, cell vector of character vectors, pattern scalar, or categorical vector. By default, histcounts uses a bin for each category in categorical array C. Use Categories to specify a unique subset of the categories instead.
Example: h = histcounts(C,["Large","Small"]) counts only the categorical data in the categories Large and Small.
Example: h = histcounts(C,"Y" + wildcardPattern) counts categorical data in all the categories whose names begin with the letter Y.
Data Types: string | cell | pattern | categorical
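As a hedged sketch of the Categories argument, the following code counts all categories and then only a subset. The categorical array is made up for illustration:

```matlab
% Illustrative categorical data (an assumption for this sketch).
C = categorical({'Large','Small','Medium','Small','Large','Large'});

% Count every category in C.
[Nall,allCats] = histcounts(C);

% Count only the Large and Small categories.
[Nsub,subCats] = histcounts(C,["Large","Small"]);
```

The Medium element is left out of Nsub, so sum(Nsub) is one less than sum(Nall).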
Name-Value Arguments
Specify optional pairs of arguments as Name1=Value1,...,NameN=ValueN, where Name is the argument name and Value is the corresponding value. Name-value arguments must appear after other arguments, but the order of the pairs does not matter.
Before R2021a, use commas to separate each name and value, and enclose Name in quotes.
Example: [N,edges] = histcounts(X,'Normalization','probability') normalizes the bin counts in N, such that sum(N) is 1.
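The two name-value syntaxes described above should give the same result. This sketch assumes R2021a or later, because the Name=Value form is not available in earlier releases:

```matlab
X = primes(100);

% Quoted name with a comma, accepted in all releases.
Nquoted = histcounts(X,'Normalization','probability');

% Name=Value form, accepted in R2021a and later.
Nequals = histcounts(X,Normalization="probability");

same = isequal(Nquoted,Nequals);
```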
Output Arguments
N
— Bin counts
row vector
Bin counts, returned as a row vector.
edges
— Bin edges
vector
Bin edges, returned as a vector. The first element is the leading edge of the first bin. The last element is the trailing edge of the last bin.
bin
— Bin indices
array
Bin indices, returned as an array of the same size as X. Each element in bin describes which numbered bin contains the corresponding element in X.
A value of 0 in bin indicates an element which does not belong to any of the bins (for example, a NaN value).
Categories
— Categories included in count
cell vector of character vectors
Categories included in count, returned as a cell vector of character vectors. Categories contains the categories in C that correspond to each count in N.
Tips
The behavior of histcounts is similar to that of the discretize function. Use histcounts to find the number of elements in each bin. On the other hand, use discretize to find which bin each element belongs to (without counting).
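A minimal side-by-side sketch of the two functions, using made-up data in which the last value lies outside the edges:

```matlab
% Illustrative data and edges (assumptions for this sketch).
X = [1 3 5 7 9];
edges = [0 4 8];

% histcounts returns one count per bin. The value 9 is outside the edges
% and is not counted.
N = histcounts(X,edges);

% discretize returns one bin index per element. The out-of-range value 9
% maps to NaN.
idx = discretize(X,edges);
```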
Extended Capabilities
Tall Arrays
Calculate with arrays that have more rows than fit in memory.
Usage notes and limitations:
Some input options are not supported. The allowed options are:
- 'BinWidth'
- 'BinLimits'
- 'Normalization'
- 'BinMethod': The 'auto' and 'scott' bin methods are the same. The 'fd' bin method is not supported.
The Categories input argument does not support pattern expressions.
For more information, see Tall Arrays.
C/C++ Code Generation
Generate C and C++ code using MATLAB® Coder™.
Usage notes and limitations:
Code generation does not support sparse matrix inputs for this function.
If you do not supply bin edges, then code generation might require variable-size arrays and dynamic memory allocation.
The Categories input argument does not support pattern expressions.
The Normalization name-value argument does not support the 'percentage' option.
GPU Code Generation
Generate CUDA® code for NVIDIA® GPUs using GPU Coder™.
Usage notes and limitations:
Code generation does not support sparse matrix inputs for this function.
If you do not supply bin edges, then code generation might require variable-size arrays and dynamic memory allocation.
The Categories input argument does not support pattern expressions.
Thread-Based Environment
Run code in the background using MATLAB® backgroundPool or accelerate code with Parallel Computing Toolbox™ ThreadPool.
This function fully supports thread-based environments. For more information, see Run MATLAB Functions in Thread-Based Environment.
GPU Arrays
Accelerate code by running on a graphics processing unit (GPU) using Parallel Computing Toolbox™.
Usage notes and limitations:
64-bit integers are not supported.
For more information, see Run MATLAB Functions on a GPU (Parallel Computing Toolbox).
Version History
Introduced in R2014b
R2023b: Normalize using percentages
You can normalize histogram values as percentages by specifying the Normalization name-value argument as 'percentage'.
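As a sketch of the 'percentage' option, assuming R2023b or later and data with no NaN or out-of-range values, the percentages should equal the 'probability' values scaled by 100:

```matlab
X = primes(100);

% Requires R2023b or later.
Npct  = histcounts(X,'Normalization','percentage');
Nprob = histcounts(X,'Normalization','probability');

% Under the assumptions above, the bin values sum to 100 and each one is
% 100 times the corresponding probability.
totalPct = sum(Npct);
maxDiff = max(abs(Npct - 100*Nprob));
```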
R2023a: Improved performance with small numeric and logical input data
The histcounts function shows improved performance for numeric and logical data due to faster input parsing. The performance improvement is more significant when input parsing is a greater portion of the computation time. This situation occurs when the size of the data to distribute among bins is smaller than 2000 elements.
For example, this code calculates histogram bin counts for a 1000-element vector. The code is about 3x faster than in the previous release.
function timingHistcounts
X = rand(1,1000);
for k = 1:3e3
    histcounts(X,"BinMethod","auto");
end
end
The approximate execution times are:
R2022b: 0.62 s
R2023a: 0.21 s
The code was timed on a Windows® 10, Intel® Xeon® CPU E5-1650 v4 @ 3.60 GHz test system using the timeit function.
timeit(@timingHistcounts)
See Also
histogram | histogram2 | discretize | histcounts2 | kde
Topics
- Replace Discouraged Instances of hist and histc