最有用的25個Matplotlib圖（含Python程式碼模板）

50個Matplotlib圖的彙編，在資料分析和視覺化中最有用。此列表允許您使用Python的Matplotlib和Seaborn庫選擇要顯示的視覺化物件。

1.關聯

2.偏差

3.排序

4.分佈

5.組成

6.變化

7.分組

# !pip install brewer2mpl
import

numpy

as

np

import

pandas

as

pd

import

matplotlib

as

mpl

import

matplotlib.pyplot

as

plt

import

seaborn

as

sns

import

warnings; warnings.filterwarnings(action=

'once'

)

large =

22

; med =

16

; small =

12

params = {

'axes.titlesize'

: large,

'legend.fontsize'

: med,

'figure.figsize'

: (

16

,

10

),

'axes.labelsize'

: med,

'axes.titlesize'

: med,

'xtick.labelsize'

: med,

'ytick.labelsize'

: med,

'figure.titlesize'

: large}

plt.rcParams.update(params)

plt.style.use(

'seaborn-whitegrid'

)

sns.set_style(

"white"

)

%matplotlib inline

# Version
print

(mpl.__version__)

#> 3.0.0
print

(sns.__version__)

#> 0.9.0

1. 散點圖

Scatteplot是用於研究兩個變數之間關係的經典和基本圖。如果資料中有多個組，則可能需要以不同顏色視覺化每個組。在Matplotlib，你可以方便地使用。

# Import dataset

midwest = pd.read_csv(

"https://raw.githubusercontent.com/selva86/datasets/master/midwest_filter.csv"

)

# Prepare Data
# Create as many colors as there are unique midwest['category']

categories = np.unique(midwest[

'category'

])

colors = [plt.cm.tab10(i/

float

(len(categories)-1))

for

i

in

range(len(categories))]

# Draw Plot for Each Category

plt.figure(figsize=(16, 10), dpi= 80, facecolor=

'w'

, edgecolor=

'k'

)

for

i, category

in

enumerate(categories):

plt.scatter(

'area'

,

'poptotal'

,

data=midwest.loc[midwest.category==category, :],

s=20, c=colors[i], label=str(category))

# Decorations

plt.gca().

set

(xlim=(0.0, 0.1), ylim=(0, 90000),

xlabel=

'Area'

, ylabel=

'Population'

)

plt.xticks(fontsize=12); plt.yticks(fontsize=12)

plt.title(

"Scatterplot of Midwest Area vs Population"

, fontsize=22)

plt.legend(fontsize=12)

plt.show()

2. 帶邊界的氣泡圖

有時，您希望在邊界內顯示一組點以強調其重要性。在此示例中，您將從應該被環繞的資料幀中獲取記錄，並將其傳遞給下面的程式碼中描述的記錄。encircle()

from

matplotlib

import

patches

from

scipy.spatial

import

ConvexHull

import

warnings; warnings.simplefilter(

'ignore'

)

sns.set_style(

"white"

)

# Step 1: Prepare Data

midwest = pd.read_csv(

"https://raw.githubusercontent.com/selva86/datasets/master/midwest_filter.csv"

)

# As many colors as there are unique midwest['category']

categories = np.unique(midwest[

'category'

])

colors = [plt.cm.tab10(i/float(len(categories)

-1

))

for

i

in

range(len(categories))]

# Step 2: Draw Scatterplot with unique color for each category

fig = plt.figure(figsize=(

16

,

10

), dpi=

80

, facecolor=

'w'

, edgecolor=

'k'

)

for

i, category

in

enumerate(categories):

plt.scatter(

'area'

,

'poptotal'

, data=midwest.loc[midwest.category==category, :], s=

'dot_size'

, c=colors[i], label=str(category), edgecolors=

'black'

, linewidths=

.5

)

# Step 3: Encircling
# https://stackoverflow.com/questions/44575681/how-do-i-encircle-different-data-sets-in-scatter-plot
defencircle(x,y, ax=None, **kw):
ifnot

ax: ax=plt.gca()

p = np.c_[x,y]

hull = ConvexHull(p)

poly = plt.Polygon(p[hull.vertices,:], **kw)

ax.add_patch(poly)

# Select data to be encircled

midwest_encircle_data = midwest.loc[midwest.state==

'IN'

, :]

# Draw polygon surrounding vertices

encircle(midwest_encircle_data.area, midwest_encircle_data.poptotal, ec=

"k"

, fc=

"gold"

, alpha=

0.1

)

encircle(midwest_encircle_data.area, midwest_encircle_data.poptotal, ec=

"firebrick"

, fc=

"none"

, linewidth=

1.5

)

# Step 4: Decorations

plt.gca().set(xlim=(

0.0

,

0.1

), ylim=(

0

,

90000

),

xlabel=

'Area'

, ylabel=

'Population'

)

plt.xticks(fontsize=

12

); plt.yticks(fontsize=

12

)

plt.title(

"Bubble Plot with Encircling"

, fontsize=

22

)

plt.legend(fontsize=

12

)

plt.show()

3. 帶線性迴歸最佳擬合線的散點圖

如果你想了解兩個變數如何相互改變，那麼最合適的線就是要走的路。下圖顯示了資料中各組之間最佳擬合線的差異。要停用分組並僅為整個資料集繪製一條最佳擬合線，請從下面的呼叫中刪除該引數。

# Import Data

df = pd.read_csv(

"https://raw.githubusercontent.com/selva86/datasets/master/mpg_ggplot2.csv"

)

df_select = df.loc[df.cyl.isin([4,8]), :]

# Plot

sns.set_style(

"white"

)

gridobj = sns.lmplot(x=

"displ"

, y=

"hwy"

, hue=

"cyl"

, data=df_select,

height=7, aspect=1.6, robust=True, palette='tab10',

scatter_kws=dict(s=60, linewidths=.7, edgecolors='black'))

# Decorations

gridobj.set(xlim=(0.5, 7.5), ylim=(0, 50))

plt.title(

"Scatterplot with line of best fit grouped by number of cylinders"

, fontsize=20)

每個迴歸線都在自己的列中

或者，您可以在其自己的列中顯示每個組的最佳擬合線。你可以透過在裡面設定引數來實現這一點。

# Import Data

df = pd.read_csv(

"https://raw.githubusercontent.com/selva86/datasets/master/mpg_ggplot2.csv"

)

df_select = df.loc[df.cyl.isin([4,8]), :]

# Each line in its own column

sns.set_style(

"white"

)

gridobj = sns.lmplot(x=

"displ"

, y=

"hwy"

,

data=df_select,

height=7,

robust=True,

palette='Set1',

col=

"cyl"

,

scatter_kws=dict(s=60, linewidths=.7, edgecolors='black'))

# Decorations

gridobj.set(xlim=(0.5, 7.5), ylim=(0, 50))

plt.show()

4. 抖動圖

通常，多個數據點具有完全相同的X和Y值。結果，多個點相互繪製並隱藏。為避免這種情況，請稍微抖動點，以便您可以直觀地看到它們。這很方便使用

# Import Data

df = pd.read_csv(

"https://raw.githubusercontent.com/selva86/datasets/master/mpg_ggplot2.csv"

)

# Draw Stripplot

fig, ax = plt.subplots(figsize=(16,10), dpi= 80)

sns.stripplot(df.cty, df.hwy, jitter=0.25, size=8, ax=ax, linewidth=.5)

# Decorations

plt.title('Use jittered plots to avoid overlapping of points', fontsize=22)

plt.show()

5. 計數圖

避免點重疊問題的另一個選擇是增加點的大小，這取決於該點中有多少點。因此，點的大小越大，周圍的點的集中度就越大。

# Import Data
df

= pd.read_csv(

"https://raw.githubusercontent.com/selva86/datasets/master/mpg_ggplot2.csv"

)

df_counts = df.groupby([

'hwy'

,

'cty'

]).size().reset_index(name=

'counts'

)

# Draw Stripplot

fig, ax = plt.subplots(figsize=(

16

,

10

), dpi=

80

)

sns.stripplot(df_counts.cty, df_counts.hwy, size=df_counts.counts*

2

, ax=ax)

# Decorations

plt.title(

'Counts Plot – Size of circle is bigger as more points overlap'

, fontsize=

22

)

plt.show()

6. 邊緣直方圖

邊緣直方圖具有沿X和Y軸變數的直方圖。這用於視覺化X和Y之間的關係以及單獨的X和Y的單變數分佈。該圖如果經常用於探索性資料分析（EDA）。

# Import Data

df = pd.read_csv(

"https://raw.githubusercontent.com/selva86/datasets/master/mpg_ggplot2.csv"

)

# Create Fig and gridspec

fig = plt.figure(figsize=(

16

,

10

), dpi=

80

)

grid = plt.GridSpec(

4

,

4

, hspace=

0

.

5

, wspace=

0

.

2

)

# Define the axes

ax_main = fig.add_subplot(grid[

:-1

,

:-1

])

ax_right = fig.add_subplot(grid[

:-1

, –

1

], xticklabels=[], yticklabels=[])

ax_bottom = fig.add_subplot(grid[-

1

,

0:-1

], xticklabels=[], yticklabels=[])

# Scatterplot on main ax

ax_main.scatter(

'displ'

,

'hwy'

, s=df.cty*

4

, c=df.manufacturer.astype(

'category'

).cat.codes, alpha=.

9

, data=df, cmap=

"tab10"

, edgecolors=

'gray'

, linewidths=.

5

)

# histogram on the right

ax_bottom.hist(df.displ,

40

, histtype=

'stepfilled'

, orientation=

'vertical'

, color=

'deeppink'

)

ax_bottom.invert_yaxis()

# histogram in the bottom

ax_right.hist(df.hwy,

40

, histtype=

'stepfilled'

, orientation=

'horizontal'

, color=

'deeppink'

)

# Decorations

ax_main.set(title=

'Scatterplot with Histograms

displ vs hwy'

, xlabel=

'displ'

, ylabel=

'hwy'

)

ax_main.title.set_fontsize(

20

)

for

item

in

([ax_main.xaxis.label, ax_main.yaxis.label] + ax_main.get_xticklabels() + ax_main.get_yticklabels()):

item.set_fontsize(

14

)

xlabels = ax_main.get_xticks().tolist()

ax_main.set_xticklabels(xlabels)

plt.show()

7.邊緣箱形圖

邊緣箱圖與邊緣直方圖具有相似的用途。然而，箱線圖有助於精確定位X和Y的中位數，第25和第75百分位數。

# Import Data

df = pd.read_csv(

"https://raw.githubusercontent.com/selva86/datasets/master/mpg_ggplot2.csv"

)

# Create Fig and gridspec

fig = plt.figure(figsize=(

16

,

10

), dpi=

80

)

grid = plt.GridSpec(

4

,

4

, hspace=

0

.

5

, wspace=

0

.

2

)

# Define the axes

ax_main = fig.add_subplot(grid[

:-1

,

:-1

])

ax_right = fig.add_subplot(grid[

:-1

, –

1

], xticklabels=[], yticklabels=[])

ax_bottom = fig.add_subplot(grid[-

1

,

0:-1

], xticklabels=[], yticklabels=[])

# Scatterplot on main ax

ax_main.scatter(

'displ'

,

'hwy'

, s=df.cty*

5

, c=df.manufacturer.astype(

'category'

).cat.codes, alpha=.

9

, data=df, cmap=

"Set1"

, edgecolors=

'black'

, linewidths=.

5

)

# Add a graph in each part

sns.boxplot(df.hwy, ax=ax_right, orient=

"v"

)

sns.boxplot(df.displ, ax=ax_bottom, orient=

"h"

)

# Decorations ——————
# Remove x axis name for the boxplot

ax_bottom.set(xlabel=

''

)

ax_right.set(ylabel=

''

)

# Main Title, Xlabel and YLabel

ax_main.set(title=

'Scatterplot with Histograms

displ vs hwy'

, xlabel=

'displ'

, ylabel=

'hwy'

)

# Set font size of different components

ax_main.title.set_fontsize(

20

)

for

item

in

([ax_main.xaxis.label, ax_main.yaxis.label] + ax_main.get_xticklabels() + ax_main.get_yticklabels()):

item.set_fontsize(

14

)

plt.show()

8. 相關圖

Correlogram用於直觀地檢視給定資料幀（或2D陣列）中所有可能的數值變數對之間的相關度量。

# Import Dataset
df

= pd.read_csv(

"https://github.com/selva86/datasets/raw/master/mtcars.csv"

)

# Plot

plt.figure(figsize=(

12

,

10

), dpi=

80

)

sns.heatmap(df.corr(), xticklabels=df.corr().columns, yticklabels=df.corr().columns, cmap=

'RdYlGn'

, center=

0

, annot=True)

# Decorations

plt.title(

'Correlogram of mtcars'

, fontsize=

22

)

plt.xticks(fontsize=

12

)

plt.yticks(fontsize=

12

)

plt.show()

9. 矩陣圖

成對圖是探索性分析中的最愛，以理解所有可能的數字變數對之間的關係。它是雙變數分析的必備工具。

# Load Dataset

df = sns.load_dataset('iris')

# Plot

plt.figure(figsize=(10,8), dpi= 80)

sns.pairplot(df, kind=

"scatter"

, hue=

"species"

, plot_kws=dict(s=80, edgecolor=

"white"

, linewidth=2.5))

plt.show()

# Load Dataset

df = sns.load_dataset('iris')

# Plot

plt.figure(figsize=(10,8), dpi= 80)

sns.pairplot(df, kind=

"reg"

, hue=

"species"

)

plt.show()

偏差

10. 發散型條形圖

如果您想根據單個指標檢視專案的變化情況，並可視化此差異的順序和數量，那麼發散條是一個很好的工具。它有助於快速區分資料中組的效能，並且非常直觀，並且可以立即傳達這一點。

# Prepare Data

df = pd.read_csv(

"https://github.com/selva86/datasets/raw/master/mtcars.csv"

)

x = df.loc[:, [

'mpg'

]]

df[

'mpg_z'

] = (x – x.mean())/x.std()

df[

'colors'

] = [

'red'if

x <

0else'green'for

x

in

df[

'mpg_z'

]]

df.sort_values(

'mpg_z'

, inplace=

True

)

df.reset_index(inplace=

True

)

# Draw plot

plt.figure(figsize=(

14

,

10

), dpi=

80

)

plt.hlines(y=df.index, xmin=

0

, xmax=df.mpg_z, color=df.colors, alpha=

0.4

, linewidth=

5

)

# Decorations

plt.gca().set(ylabel=

'$Model$'

, xlabel=

'$Mileage$'

)

plt.yticks(df.index, df.cars, fontsize=

12

)

plt.title(

'Diverging Bars of Car Mileage'

, fontdict={

'size'

:

20

})

plt.grid(linestyle=

'–'

, alpha=

0.5

)

plt.show()

11. 發散型文字

分散的文字類似於發散條，如果你想以一種漂亮和可呈現的方式顯示圖表中每個專案的價值，它更喜歡。

# Prepare Data

df = pd.read_csv(

"https://github.com/selva86/datasets/raw/master/mtcars.csv"

)

x = df.loc[:, [

'mpg'

]]

df[

'mpg_z'

] = (x – x.mean())/x.std()

df[

'colors'

] = [

'red'if

x <

0else'green'for

x

in

df[

'mpg_z'

]]

df.sort_values(

'mpg_z'

, inplace=

True

)

df.reset_index(inplace=

True

)

# Draw plot

plt.figure(figsize=(

14

,

14

), dpi=

80

)

plt.hlines(y=df.index, xmin=

0

, xmax=df.mpg_z)

for

x, y, tex

in

zip(df.mpg_z, df.index, df.mpg_z):

t = plt.text(x, y, round(tex,

2

), horizontalalignment=

'right'if

x <

0else'left'

,

verticalalignment=

'center'

, fontdict={

'color'

:

'red'if

x <

0else'green'

,

'size'

:

14

})

# Decorations

plt.yticks(df.index, df.cars, fontsize=

12

)

plt.title(

'Diverging Text Bars of Car Mileage'

, fontdict={

'size'

:

20

})

plt.grid(linestyle=

'–'

, alpha=

0.5

)

plt.xlim(

-2.5

,

2.5

)

plt.show()

12. 發散型包點圖

發散點圖也類似於發散條。然而，與發散條相比，條的不存在減少了組之間的對比度和差異。

# Prepare Data

df = pd.read_csv(

"https://github.com/selva86/datasets/raw/master/mtcars.csv"

)

x = df.loc[:, [

'mpg'

]]

df[

'mpg_z'

] = (x – x.mean())/x.std()

df[

'colors'

] = [

'red'if

x <

0else'darkgreen'for

x

in

df[

'mpg_z'

]]

df.sort_values(

'mpg_z'

, inplace=

True

)

df.reset_index(inplace=

True

)

# Draw plot

plt.figure(figsize=(

14

,

16

), dpi=

80

)

plt.scatter(df.mpg_z, df.index, s=

450

, alpha=

.6

, color=df.colors)

for

x, y, tex

in

zip(df.mpg_z, df.index, df.mpg_z):

t = plt.text(x, y, round(tex,

1

), horizontalalignment=

'center'

,

verticalalignment=

'center'

, fontdict={

'color'

:

'white'

})

# Decorations
# Lighten borders

plt.gca().spines[

"top"

].set_alpha(

.3

)

plt.gca().spines[

"bottom"

].set_alpha(

.3

)

plt.gca().spines[

"right"

].set_alpha(

.3

)

plt.gca().spines[

"left"

].set_alpha(

.3

)

plt.yticks(df.index, df.cars)

plt.title(

'Diverging Dotplot of Car Mileage'

, fontdict={

'size'

:

20

})

plt.xlabel(

'$Mileage$'

)

plt.grid(linestyle=

'–'

, alpha=

0.5

)

plt.xlim(

-2.5

,

2.5

)

plt.show()

13. 帶標記的發散型棒棒糖圖

帶標記的棒棒糖透過強調您想要引起注意的任何重要資料點並在圖表中適當地給出推理，提供了一種視覺化分歧的靈活方式。

# Prepare Data

df = pd.read_csv(

"https://github.com/selva86/datasets/raw/master/mtcars.csv"

)

x = df.loc[:, [

'mpg'

]]

df[

'mpg_z'

] = (x – x.mean())/x.std()

df[

'colors'

] =

'black'

# color fiat differently

df.loc[df.cars ==

'Fiat X1-9'

,

'colors'

] =

'darkorange'

df.sort_values(

'mpg_z'

, inplace=

True

)

df.reset_index(inplace=

True

)

# Draw plot
import

matplotlib.patches

as

patches

plt.figure(figsize=(

14

,

16

), dpi=

80

)

plt.hlines(y=df.index, xmin=

0

, xmax=df.mpg_z, color=df.colors, alpha=

0.4

, linewidth=

1

)

plt.scatter(df.mpg_z, df.index, color=df.colors, s=[

600if

x ==

'Fiat X1-9'else300for

x

in

df.cars], alpha=

0.6

)

plt.yticks(df.index, df.cars)

plt.xticks(fontsize=

12

)

# Annotate

plt.annotate(

'Mercedes Models'

, xy=(

0.0

,

11.0

), xytext=(

1.0

,

11

), xycoords=

'data'

,

fontsize=

15

, ha=

'center'

, va=

'center'

,

bbox=dict(boxstyle=

'square'

, fc=

'firebrick'

),

arrowprops=dict(arrowstyle=

'-[, widthB=2.0, lengthB=1.5'

, lw=

2.0

, color=

'steelblue'

), color=

'white'

)

# Add Patches

p1 = patches.Rectangle((

-2.0

,

-1

), width=

.3

, height=

3

, alpha=

.2

, facecolor=

'red'

)

p2 = patches.Rectangle((

1.5

,

27

), width=

.8

, height=

5

, alpha=

.2

, facecolor=

'green'

)

plt.gca().add_patch(p1)

plt.gca().add_patch(p2)

# Decorate

plt.title(

'Diverging Bars of Car Mileage'

, fontdict={

'size'

:

20

})

plt.grid(linestyle=

'–'

, alpha=

0.5

)

plt.show()

14.面積圖

透過對軸和線之間的區域進行著色，區域圖不僅強調峰值和低谷，而且還強調高點和低點的持續時間。高點持續時間越長，線下面積越大。

import

numpy

as

np

import

pandas

as

pd

# Prepare Data

df = pd.read_csv(

"https://github.com/selva86/datasets/raw/master/economics.csv"

, parse_dates=[

'date'

]).head(

100

)

x = np.arange(df.shape[

0

])

y_returns = (df.psavert.diff().fillna(

0

)/df.psavert.shift(

1

)).fillna(

0

) *

100

# Plot

plt.figure(figsize=(

16

,

10

), dpi=

80

)

plt.fill_between(x[

1

:], y_returns[

1

:],

0

, where=y_returns[

1

:] >=

0

, facecolor=

'green'

, interpolate=

True

, alpha=

0.7

)

plt.fill_between(x[

1

:], y_returns[

1

:],

0

, where=y_returns[

1

:] <=

0

, facecolor=

'red'

, interpolate=

True

, alpha=

0.7

)

# Annotate

plt.annotate(

'Peak

1975'

, xy=(

94.0

,

21.0

), xytext=(

88.0

,

28

),

bbox=dict(boxstyle=

'square'

, fc=

'firebrick'

),

arrowprops=dict(facecolor=

'steelblue'

, shrink=

0.05

), fontsize=

15

, color=

'white'

)

# Decorations

xtickvals = [str(m)[:

3

].upper()+

"-"

+str(y)

for

y,m

in

zip(df.date.dt.year, df.date.dt.month_name())]

plt.gca().set_xticks(x[::

6

])

plt.gca().set_xticklabels(xtickvals[::

6

], rotation=

90

, fontdict={

'horizontalalignment'

:

'center'

,

'verticalalignment'

:

'center_baseline'

})

plt.ylim(

-35

,

35

)

plt.xlim(

1

,

100

)

plt.title(

"Month Economics Return %"

, fontsize=

22

)

plt.ylabel(

'Monthly returns %'

)

plt.grid(alpha=

0.5

)

plt.show()

排序

15. 有序條形圖

有序條形圖有效地傳達了專案的排名順序。但是，在圖表上方新增度量標準的值，使用者可以從圖表本身獲取精確資訊。

# Prepare Data

df_raw = pd.read_csv(

"https://github.com/selva86/datasets/raw/master/mpg_ggplot2.csv"

)

df = df_raw[[

'cty'

,

'manufacturer'

]].groupby(

'manufacturer'

).apply(

lambda

x: x.mean())

df.sort_values(

'cty'

, inplace=

True

)

df.reset_index(inplace=

True

)

# Draw plotimport

matplotlib.patches

as

patches

fig, ax = plt.subplots(figsize=(

16

,

10

), facecolor=

'white'

, dpi=

80

)

ax.vlines(x=df.index, ymin=

0

, ymax=df.cty, color=

'firebrick'

, alpha=

0.7

, linewidth=

20

)

# Annotate Textfor

i, cty

in

enumerate(df.cty):

ax.text(i, cty+

0.5

, round(cty,

1

), horizontalalignment=

'center'

)

# Title, Label, Ticks and Ylim

ax.set_title(

'Bar Chart for Highway Mileage'

, fontdict={

'size'

:

22

})

ax.set(ylabel=

'Miles Per Gallon'

, ylim=(

0

,

30

))

plt.xticks(df.index, df.manufacturer.str.upper(), rotation=

60

, horizontalalignment=

'right'

, fontsize=

12

)

# Add patches to color the X axis labels

p1 = patches.Rectangle((

.57

,

-0.005

), width=

.33

, height=

.13

, alpha=

.1

, facecolor=

'green'

, transform=fig.transFigure)

p2 = patches.Rectangle((

.124

,

-0.005

), width=

.446

, height=

.13

, alpha=

.1

, facecolor=

'red'

, transform=fig.transFigure)

fig.add_artist(p1)

fig.add_artist(p2)

plt.show()

16. 棒棒糖圖

棒棒糖圖表以一種視覺上令人愉悅的方式提供與有序條形圖類似的目的。

# Prepare Data

df_raw = pd.read_csv(

"https://github.com/selva86/datasets/raw/master/mpg_ggplot2.csv"

)

df = df_raw[[

'cty'

,

'manufacturer'

]].groupby(

'manufacturer'

).apply(

lambda

x: x.mean())

df.sort_values(

'cty'

, inplace=

True

)

df.reset_index(inplace=

True

)

# Draw plot

fig, ax = plt.subplots(figsize=(

16

,

10

), dpi=

80

)

ax.vlines(x=df.index, ymin=

0

, ymax=df.cty, color=

'firebrick'

, alpha=

0.7

, linewidth=

2

)

ax.scatter(x=df.index, y=df.cty, s=

75

, color=

'firebrick'

, alpha=

0.7

)

# Title, Label, Ticks and Ylim

ax.set_title(

'Lollipop Chart for Highway Mileage'

, fontdict={

'size'

:

22

})

ax.set_ylabel(

'Miles Per Gallon'

)

ax.set_xticks(df.index)

ax.set_xticklabels(df.manufacturer.str.upper(), rotation=

60

, fontdict={

'horizontalalignment'

:

'right'

,

'size'

:

12

})

ax.set_ylim(

0

,

30

)

# Annotate
for

row

in

df.itertuples():

ax.text(row.Index, row.cty+

.5

, s=round(row.cty,

2

), horizontalalignment=

'center'

, verticalalignment=

'bottom'

, fontsize=

14

)

plt.show()

17. 包點圖

點圖表傳達了專案的排名順序。由於它沿水平軸對齊，因此您可以更容易地看到點彼此之間的距離。

# Prepare Data

df_raw = pd.read_csv(

"https://github.com/selva86/datasets/raw/master/mpg_ggplot2.csv"

)

df = df_raw[[

'cty'

,

'manufacturer'

]].groupby(

'manufacturer'

).apply(

lambda

x: x.mean())

df.sort_values(

'cty'

, inplace=

True

)

df.reset_index(inplace=

True

)

# Draw plot

fig, ax = plt.subplots(figsize=(

16

,

10

), dpi=

80

)

ax.hlines(y=df.index, xmin=

11

, xmax=

26

, color=

'gray'

, alpha=

0.7

, linewidth=

1

, linestyles=

'dashdot'

)

ax.scatter(y=df.index, x=df.cty, s=

75

, color=

'firebrick'

, alpha=

0.7

)

# Title, Label, Ticks and Ylim

ax.set_title(

'Dot Plot for Highway Mileage'

, fontdict={

'size'

:

22

})

ax.set_xlabel(

'Miles Per Gallon'

)

ax.set_yticks(df.index)

ax.set_yticklabels(df.manufacturer.str.title(), fontdict={

'horizontalalignment'

:

'right'

})

ax.set_xlim(

10

,

27

)

plt.show()

18. 坡度圖

斜率圖最適合比較給定人/專案的“之前”和“之後”位置。

import matplotlib.lines

as

mlines

# Import Data

df = pd.read_csv(

"https://raw.githubusercontent.com/selva86/datasets/master/gdppercap.csv"

)

left_label = [str(c) +

', '

+ str(round(y))

for

c,

y

inzip

(

df.continent, df['1952']

)]

right_label

= [str(c) +

', '

+ str(round(y))

for

c,

y

inzip

(

df.continent, df['1957']

)]

klass

= [

'red'if

(y1-y2) <

0else'green'for

y1,

y2

inzip

(

df['1952'], df['1957']

)]

# draw line

# https:

//stackoverflow.com/questions/36470343/how-to-draw-a-line-with-matplotlib/36479941

def

newline

(

p1, p2, color='black'

):

ax

= plt.gca()

l = mlines.Line2D([p1[

0

],p2[

0

]], [p1[

1

],p2[

1

]], color=

'red'if

p1[

1

]-p2[

1

] >

0else'green'

, marker=

'o'

, markersize=

6

)

ax.add_line(l)

return

l

fig, ax = plt.subplots(

1

,

1

,figsize=(

14

,

14

), dpi=

80

)

# Vertical Lines

ax.vlines(x=

1

, ymin=

500

, ymax=

13000

, color=

'black'

, alpha=

0.7

, linewidth=

1

, linestyles=

'dotted'

)

ax.vlines(x=

3

, ymin=

500

, ymax=

13000

, color=

'black'

, alpha=

0.7

, linewidth=

1

, linestyles=

'dotted'

)

# Points

ax.scatter(y=df[

'1952'

], x=np.repeat(

1

, df.shape[

0

]), s=

10

, color=

'black'

, alpha=

0.7

)

ax.scatter(y=df[

'1957'

], x=np.repeat(

3

, df.shape[

0

]), s=

10

, color=

'black'

, alpha=

0.7

)

# Line Segmentsand Annotation
for

p1, p2,

c

inzip

(

df['1952'], df['1957'], df['continent']

):

newline

(

[1,p1], [3,p2]

)

ax.

text

(

1-0.05, p1, c + ', ' + str(round(p1

)), horizontalalignment

=

'right'

, verticalalignment=

'center'

, fontdict={

'size'

:

14

})

ax.text(

3

+

0.05

, p2, c +

', '

+ str(round(p2)), horizontalalignment=

'left'

, verticalalignment=

'center'

, fontdict={

'size'

:

14

})

# 'Before' and 'After' Annotations

ax.text(

1-0.05

,

13000

,

'BEFORE'

, horizontalalignment=

'right'

, verticalalignment=

'center'

, fontdict={

'size'

:

18

,

'weight'

:

700

})

ax.text(

3

+

0.05

,

13000

,

'AFTER'

, horizontalalignment=

'left'

, verticalalignment=

'center'

, fontdict={

'size'

:

18

,

'weight'

:

700

})

# Decoration

ax.set_title(

"Slopechart: Comparing GDP Per Capita between 1952 vs 1957"

, fontdict={

'size'

:

22

})

ax.

set

(xlim=(

0

,

4

), ylim=(

0

,

14000

), ylabel=

'Mean GDP Per Capita'

)

ax.set_xticks([

1

,

3

])

ax.set_xticklabels([

"1952"

,

"1957"

])

plt.yticks(np.arange(

500

,

13000

,

2000

), fontsize=

12

)

# Lighten borders

plt.gca().spines[

"top"

].set_alpha(

.0

)

plt.gca().spines[

"bottom"

].set_alpha(

.0

)

plt.gca().spines[

"right"

].set_alpha(

.0

)

plt.gca().spines[

"left"

].set_alpha(

.0

)

plt.show()

19. 啞鈴圖

啞鈴圖傳達各種專案的“前”和“後”位置以及專案的排序。如果您想要將特定專案/計劃對不同物件的影響視覺化，那麼它非常有用。

import

matplotlib.lines

as

mlines

# Import Data

df = pd.read_csv(

"https://raw.githubusercontent.com/selva86/datasets/master/health.csv"

)

df.sort_values(

'pct_2014'

, inplace=

True

)

df.reset_index(inplace=

True

)

# Func to draw line segment
defnewline(p1, p2, color='black'):

ax = plt.gca()

l = mlines.Line2D([p1[

0

],p2[

0

]], [p1[

1

],p2[

1

]], color=

'skyblue'

)

ax.add_line(l)

return

l

# Figure and Axes

fig, ax = plt.subplots(

1

,

1

,figsize=(

14

,

14

), facecolor=

'#f7f7f7'

, dpi=

80

)

# Vertical Lines

ax.vlines(x=

.05

, ymin=

0

, ymax=

26

, color=

'black'

, alpha=

1

, linewidth=

1

, linestyles=

'dotted'

)

ax.vlines(x=

.10

, ymin=

0

, ymax=

26

, color=

'black'

, alpha=

1

, linewidth=

1

, linestyles=

'dotted'

)

ax.vlines(x=

.15

, ymin=

0

, ymax=

26

, color=

'black'

, alpha=

1

, linewidth=

1

, linestyles=

'dotted'

)

ax.vlines(x=

.20

, ymin=

0

, ymax=

26

, color=

'black'

, alpha=

1

, linewidth=

1

, linestyles=

'dotted'

)

# Points

ax.scatter(y=df[

'index'

], x=df[

'pct_2013'

], s=

50

, color=

'#0e668b'

, alpha=

0.7

)

ax.scatter(y=df[

'index'

], x=df[

'pct_2014'

], s=

50

, color=

'#a3c4dc'

, alpha=

0.7

)

# Line Segments
for

i, p1, p2

in

zip(df[

'index'

], df[

'pct_2013'

], df[

'pct_2014'

]):

newline([p1, i], [p2, i])

# Decoration

ax.set_facecolor(

'#f7f7f7'

)

ax.set_title(

"Dumbell Chart: Pct Change – 2013 vs 2014"

, fontdict={

'size'

:

22

})

ax.set(xlim=(

0

,

.25

), ylim=(

-1

,

27

), ylabel=

'Mean GDP Per Capita'

)

ax.set_xticks([

.05

,

.1

,

.15

,

.20

])

ax.set_xticklabels([

'5%'

,

'15%'

,

'20%'

,

'25%'

])

ax.set_xticklabels([

'5%'

,

'15%'

,

'20%'

,

'25%'

])

plt.show()

分配

20. 連續變數的直方圖

直方圖顯示給定變數的頻率分佈。下面的表示基於分類變數對頻率條進行分組，從而更好地瞭解連續變數和串聯變數。

# Import Data

df = pd.read_csv(

"https://github.com/selva86/datasets/raw/master/mpg_ggplot2.csv"

)

# Prepare data

x_var =

'displ'

groupby_var =

'class'

df_agg = df.loc[:, [x_var, groupby_var]].groupby(groupby_var)

vals = [df[x_var].values.tolist()

for

i, df

in

df_agg]

# Draw

plt.figure(figsize=(

16

,

9

), dpi=

80

)

colors = [plt.cm.Spectral(i/

float

(len(vals)

-1

))

for

i

inrange

(

len(vals

))]

n, bins, patches

= plt.hist(vals,

30

, stacked=True, density=False, color=colors[:len(vals)])

# Decoration

plt.legend({

group

:col

forgroup

,

col

inzip

(

np.unique(df[groupby_var]

).

tolist

(

), colors[:

len

(

vals

)])})

plt.

title

(

f"Stacked Histogram of ${x_var}$ colored by ${groupby_var}$", fontsize=22

)

plt.

xlabel

(

x_var

)

plt.

ylabel

(

"Frequency"

)

plt.

ylim

(

0, 25

)

plt.

xticks

(

ticks=bins[::3], labels=[round(b,1

)

for

b

in

bins[::3]])

plt.

show

(

)

21. 型別變數的直方圖

分類變數的直方圖顯示該變數的頻率分佈。透過對條形圖進行著色，您可以將分佈與表示顏色的另一個分類變數相關聯。

# Import Data

df = pd.read_csv(

"https://github.com/selva86/datasets/raw/master/mpg_ggplot2.csv"

)

# Prepare data

x_var =

'manufacturer'

groupby_var =

'class'

df_agg = df.loc[:, [x_var, groupby_var]].groupby(groupby_var)

vals = [df[x_var].values.tolist()

for

i, df

in

df_agg]

# Draw

plt.figure(figsize=(

16

,

9

), dpi=

80

)

colors = [plt.cm.Spectral(i/

float

(len(vals)

-1

))

for

i

inrange

(

len(vals

))]

n, bins, patches

= plt.hist(vals, df[x_var].unique().__len__(), stacked=True, density=False, color=colors[:len(vals)])

# Decoration

plt.legend({

group

:col

forgroup

,

col

inzip

(

np.unique(df[groupby_var]

).

tolist

(

), colors[:

len

(

vals

)])})

plt.

title

(

f"Stacked Histogram of ${x_var}$ colored by ${groupby_var}$", fontsize=22

)

plt.

xlabel

(

x_var

)

plt.

ylabel

(

"Frequency"

)

plt.

ylim

(

0, 40

)

plt.

xticks

(

ticks=bins, labels=np.unique(df[x_var]

).

tolist

(

), rotation

=

90

, horizontalalignment=

'left'

)

plt.show()

22. 密度圖

密度圖是一種常用工具，視覺化連續變數的分佈。透過“響應”變數對它們進行分組，您可以檢查X和Y之間的關係。以下情況，如果出於代表性目的來描述城市裡程的分佈如何隨著汽缸數的變化而變化。

# Import Data

df = pd.read_csv(

"https://github.com/selva86/datasets/raw/master/mpg_ggplot2.csv"

)

# Draw Plot

plt.figure(figsize=(

16

,

10

), dpi=

80

)

sns.kdeplot(df.loc[df[

'cyl'

] ==

4

,

"cty"

], shade=

True

, color=

"g"

, label=

"Cyl=4"

, alpha=

.7

)

sns.kdeplot(df.loc[df[

'cyl'

] ==

5

,

"cty"

], shade=

True

, color=

"deeppink"

, label=

"Cyl=5"

, alpha=

.7

)

sns.kdeplot(df.loc[df[

'cyl'

] ==

6

,

"cty"

], shade=

True

, color=

"dodgerblue"

, label=

"Cyl=6"

, alpha=

.7

)

sns.kdeplot(df.loc[df[

'cyl'

] ==

8

,

"cty"

], shade=

True

, color=

"orange"

, label=

"Cyl=8"

, alpha=

.7

)

# Decoration

plt.title(

'Density Plot of City Mileage by n_Cylinders'

, fontsize=

22

)

plt.legend()

23. 直方密度線圖

帶有直方圖的密度曲線將兩個圖表傳達的集體資訊彙集在一起，這樣您就可以將它們放在一個圖形而不是兩個圖形中。

# Import Data

df = pd.read_csv(

"https://github.com/selva86/datasets/raw/master/mpg_ggplot2.csv"

)

# Draw Plot

plt.figure(figsize=(13,10), dpi= 80)

sns.distplot(df.loc[df[

'class'

] ==

'compact'

,

"cty"

], color=

"dodgerblue"

, label=

"Compact"

, hist_kws={

'alpha'

:.7}, kde_kws={

'linewidth'

:3})

sns.distplot(df.loc[df[

'class'

] ==

'suv'

,

"cty"

], color=

"orange"

, label=

"SUV"

, hist_kws={

'alpha'

:.7}, kde_kws={

'linewidth'

:3})

sns.distplot(df.loc[df[

'class'

] ==

'minivan'

,

"cty"

], color=

"g"

, label=

"minivan"

, hist_kws={

'alpha'

:.7}, kde_kws={

'linewidth'

:3})

plt.ylim(0, 0.35)

# Decoration

plt.title(

'Density Plot of City Mileage by Vehicle Type'

, fontsize=22)

plt.legend()

plt.show()

24. Joy Plot

Joy Plot允許不同組的密度曲線重疊，這是一種視覺化相對於彼此的大量組的分佈的好方法。它看起來很悅目，並清楚地傳達了正確的資訊。它可以使用joypy基於的包來輕鬆構建matplotlib。

# !pip install joypy
# Import Data

mpg = pd.read_csv(

"https://github.com/selva86/datasets/raw/master/mpg_ggplot2.csv"

)

# Draw Plot

plt.figure(figsize=(

16

,

10

), dpi=

80

)

fig, axes = joypy.joyplot(mpg, column=[

'hwy'

,

'cty'

],

by

=

"class"

, ylim=

'own'

, figsize=(

14

,

10

))

# Decoration

plt.title(

'Joy Plot of City and Highway Mileage by Class'

, fontsize=

22

)

plt.show()

25. 分散式點圖

分佈點圖顯示按組分割的點的單變數分佈。點數越暗，該區域的資料點集中度越高。透過對中位數進行不同著色，組的真實定位立即變得明顯。

import

matplotlib.patches

as

mpatches

# Prepare Data

df_raw = pd.read_csv(

"https://github.com/selva86/datasets/raw/master/mpg_ggplot2.csv"

)

cyl_colors = {

4

:

'tab:red'

,

5

:

'tab:green'

,

6

:

'tab:blue'

,

8

:

'tab:orange'

}

df_raw[

'cyl_color'

] = df_raw.cyl.map(cyl_colors)

# Mean and Median city mileage by make

df = df_raw[[

'cty'

,

'manufacturer'

]].groupby(

'manufacturer'

).apply(

lambda

x: x.mean())

df.sort_values(

'cty'

, ascending=

False

, inplace=

True

)

df.reset_index(inplace=

True

)

df_median = df_raw[[

'cty'

,

'manufacturer'

]].groupby(

'manufacturer'

).apply(

lambda

x: x.median())

# Draw horizontal lines

fig, ax = plt.subplots(figsize=(

16

,

10

), dpi=

80

)

ax.hlines(y=df.index, xmin=

0

, xmax=

40

, color=

'gray'

, alpha=

0.5

, linewidth=

.5

, linestyles=

'dashdot'

)

# Draw the Dots
for

i, make

in

enumerate(df.manufacturer):

df_make = df_raw.loc[df_raw.manufacturer==make, :]

ax.scatter(y=np.repeat(i, df_make.shape[

0

]), x=

'cty'

, data=df_make, s=

75

, edgecolors=

'gray'

, c=

'w'

, alpha=

0.5

)

ax.scatter(y=i, x=

'cty'

, data=df_median.loc[df_median.index==make, :], s=

75

, c=

'firebrick'

)

# Annotate

ax.text(

33

,

13

,

"$red ; dots ; are ; the : median$"

, fontdict={

'size'

:

12

}, color=

'firebrick'

)

# Decorations

red_patch = plt.plot([],[], marker=

"o"

, ms=

10

, ls=

""

, mec=

None

, color=

'firebrick'

, label=

"Median"

)

plt.legend(handles=red_patch)

ax.set_title(

'Distribution of City Mileage by Make'

, fontdict={

'size'

:

22

})

ax.set_xlabel(

'Miles Per Gallon (City)'

, alpha=

0.7

)

ax.set_yticks(df.index)

ax.set_yticklabels(df.manufacturer.str.title(), fontdict={

'horizontalalignment'

:

'right'

}, alpha=

0.7

)

ax.set_xlim(

1

,

40

)

plt.xticks(alpha=

0.7

)

plt.gca().spines[

"top"

].set_visible(

False

)

plt.gca().spines[

"bottom"

].set_visible(

False

)

plt.gca().spines[

"right"

].set_visible(

False

)

plt.gca().spines[

"left"

].set_visible(

False

)

plt.grid(axis=

'both'

, alpha=

.4

, linewidth=

.1

)

plt.show()

本文參考自：

https://www.machinelearningplus.com/plots/top-50-matplotlib-visualizations-the-master-plots-python/

dignews.cc

最有用的25個Matplotlib圖（含Python程式碼模板）

關於我們

相關文章

動態連結的魔法：Linux下動態連結庫機制探討

R包vegan的Manteltests

使用DeepSeek拯救資料中臺

零基礎入門：DeepSeek微調教程來了！

[電腦]FutureMachSF-I&MSIMPGZ890IEDGETIWIFI

[第十屆機王爭霸賽]無光亂鬥組——「靜·雅」byfu.

Linux常用命令列：解鎖你的生產力

輕鬆搞定，資料開發與分析也能如此簡單方便！

[評測]MSIMPGX870EEdgeTiWiFi評測

AX錄取捷報｜2025美國英國本科Offer（一）