Peak finding tells you where the diffraction peaks are. Peak fitting tells you everything else: the precise position (to better than 0.01°), the full width at half maximum (FWHM) that encodes crystallite size and strain, the integrated intensity for quantitative phase analysis, and the profile shape that reveals the relative contributions of instrumental broadening versus sample effects. If peak finding is the reconnaissance, peak fitting is the measurement.
This is the second post in my XRD processing series. I am assuming you have already found your peak positions using the methods from the previous post and now want to extract quantitative information from each reflection. We will cover the three standard profile functions — Gaussian, Lorentzian, and Voigt — understand when each is appropriate, and implement the fitting in Python.
1. Why Profile Shape Matters
A diffraction peak is never a delta function. It is broadened by two fundamentally different mechanisms:
Instrumental Broadening
Caused by the finite width of the X-ray source, axial divergence, slit geometry, and detector response. This contribution is approximately Gaussian in shape. It is a property of your diffractometer, not your sample, and can be measured using a standard reference material (LaB6, Si, or CeO2).
Sample Broadening
Caused by finite crystallite size (Scherrer broadening) and microstrain (non-uniform lattice distortions). Size broadening produces a Lorentzian profile shape, while strain broadening is typically Gaussian. Real samples usually have both, so the sample contribution is a mixture.
The observed peak profile is the convolution of these contributions. Since the convolution of a Gaussian with a Lorentzian is a Voigt function, the Voigt profile (or its computationally cheaper approximation, the pseudo-Voigt) is the physically motivated choice for XRD peak fitting.
2. The Three Profile Functions
Gaussian
where \(I_0\) is the peak amplitude, \(2\theta_0\) is the peak center, and \(w_G\) is the FWHM. The Gaussian drops off quickly — by the time you are 2× the FWHM away from center, the intensity is essentially zero. This makes Gaussians easy to fit but poor at capturing the extended tails of real diffraction peaks.
Lorentzian (Cauchy)
The Lorentzian has the same three parameters but much heavier tails — it decays as \(1/(2\theta)^2\) rather than exponentially. This means peaks overlap more, and fitting becomes sensitive to how much of the tail region you include. Lorentzians are physically appropriate for size broadening of very small crystallites.
Voigt and Pseudo-Voigt
The true Voigt function is the convolution \(V = G \otimes L\), which has no closed-form expression (it involves the Faddeeva function). In practice, most XRD software uses the pseudo-Voigt approximation — a linear combination rather than a convolution:
where \(\eta\) is the mixing parameter (\(0 \leq \eta \leq 1\)). When \(\eta = 0\), the profile is pure Gaussian; when \(\eta = 1\), pure Lorentzian. For most powder diffraction data, \(\eta\) falls between 0.3 and 0.7. The pseudo-Voigt is remarkably accurate — the maximum error relative to a true Voigt is less than 1% for typical XRD line shapes.
A low \(\eta\) (closer to Gaussian) suggests the broadening is dominated by strain and instrumental effects. A high \(\eta\) (closer to Lorentzian) indicates dominant size broadening from small crystallites. For our Nd0.7Sr0.3MnO3 nanoparticles (crystallite size ~20–40 nm), we typically see \(\eta \approx 0.5\text{–}0.6\), reflecting the balance between size and strain contributions.
3. Fitting Parameters Summary
| Parameter | Symbol | Physical Meaning |
|---|---|---|
| Peak center | \(2\theta_0\) | Bragg angle → d-spacing via Bragg's law |
| Amplitude | \(I_0\) | Peak height above baseline |
| FWHM | \(w\) | Full width at half maximum → size & strain via Scherrer/W-H |
| Mixing parameter | \(\eta\) | Lorentzian fraction (pseudo-Voigt only) |
| Integrated intensity | \(A\) | Area under peak → phase fraction, structure factor |
4. Python Implementation
Here is a complete implementation that fits each detected peak with a pseudo-Voigt profile using scipy.optimize.curve_fit. The key insight is to fit each peak individually within a local window (typically ±5× the estimated FWHM) rather than trying to fit the entire pattern at once.
from scipy.optimize import curve_fit
from scipy.signal import peak_widths
def gaussian(x, amp, cen, wid):
"""Gaussian profile. wid = FWHM."""
return amp * np.exp(-4 * np.log(2) * ((x - cen) / wid)**2)
def lorentzian(x, amp, cen, wid):
"""Lorentzian profile. wid = FWHM."""
return amp / (1 + 4 * ((x - cen) / wid)**2)
def pseudo_voigt(x, amp, cen, wid, eta):
"""Pseudo-Voigt: eta*Lorentzian + (1-eta)*Gaussian."""
return eta * lorentzian(x, amp, cen, wid) + \
(1 - eta) * gaussian(x, amp, cen, wid)
"""Fit pseudo-Voigt to each detected peak.
Parameters
----------
two_theta : array, 2θ values
intensity : array, background-subtracted intensity
peak_indices : array, indices of detected peaks
window_deg : float, half-width of fitting window in degrees
"""
results = []
for idx in peak_indices:
cen0 = two_theta[idx]
amp0 = intensity[idx]
# Select local window around peak
mask = (two_theta > cen0 - window_deg) & \
(two_theta < cen0 + window_deg)
x = two_theta[mask]
y = intensity[mask]
# Initial guess: [amplitude, center, FWHM, eta]
p0 = [amp0, cen0, 0.15, 0.5]
# Bounds: amp>0, center within window,
# FWHM between 0.01° and 2°, 0≤eta≤1
bounds = ([0, cen0-0.5, 0.01, 0],
[amp0*2, cen0+0.5, 2.0, 1])
try:
popt, pcov = curve_fit(pseudo_voigt, x, y,
p0=p0, bounds=bounds,
maxfev=5000)
perr = np.sqrt(np.diag(pcov))
# Calculate integrated intensity (area)
x_fine = np.linspace(x[0], x[-1], 1000)
area = np.trapz(pseudo_voigt(x_fine, *popt), x_fine)
results.append({
'center': popt[1],
'center_err': perr[1],
'amplitude': popt[0],
'fwhm': popt[2],
'fwhm_err': perr[2],
'eta': popt[3],
'area': area,
'popt': popt,
'x_data': x,
'y_data': y
})
except RuntimeError:
print(f' Fit failed for peak at {cen0:.2f}°')
return results
5. Evaluating Fit Quality
A fit is only as good as its residuals. After fitting, always compute and plot the difference between the data and the fitted profile. Systematic patterns in the residuals — oscillations, asymmetric deviations, or trends — indicate a poor model choice or an overlapping peak that was not accounted for.
Python — Plotting fits with residualsdef plot_fit(result, idx):
x, y = result['x_data'], result['y_data']
x_fine = np.linspace(x[0], x[-1], 500)
y_fit = pseudo_voigt(x_fine, *result['popt'])
y_calc = pseudo_voigt(x, *result['popt'])
residual = y - y_calc
fig, (ax1, ax2) = plt.subplots(2, 1,
figsize=(8, 5), height_ratios=[3, 1],
sharex=True, gridspec_kw={'hspace': 0.05})
# Main plot
ax1.plot(x, y, 'ko', ms=3, label='Data')
ax1.plot(x_fine, y_fit, 'r-', lw=1.5,
label=f'pV fit (η={result["eta"]:.2f})')
ax1.set_ylabel('Intensity')
ax1.legend()
ax1.set_title(f'Peak #{idx+1}: 2θ = {result["center"]:.3f}°, '
f'FWHM = {result["fwhm"]:.4f}°')
# Residual plot
ax2.plot(x, residual, 'b-', lw=0.8)
ax2.axhline(0, color='gray', ls='--', lw=0.5)
ax2.set_xlabel(r'$2\theta$ (°)')
ax2.set_ylabel('Residual')
plt.tight_layout()
plt.show()
Good residuals look like random noise centered on zero. If you see a W-shape (positive-negative-positive), the peak tails are not captured well — try increasing \(\eta\) or switching to a true Voigt function. If you see asymmetry, your peak may have intrinsic asymmetry from axial divergence (common at low \(2\theta\)) that requires a more complex profile function.
6. Handling Overlapping Peaks
In many materials — especially lower-symmetry structures like orthorhombic perovskites — some reflections overlap. For example, in Pbnm Nd0.7Sr0.3MnO3, the (200) and (112) reflections near \(2\theta \approx 46\)° often partially overlap. In these cases, you need to fit multiple peaks simultaneously within the same window:
Python — Fitting two overlapping peaks"""Sum of two pseudo-Voigt profiles."""
return pseudo_voigt(x, a1, c1, w1, eta1) + \
pseudo_voigt(x, a2, c2, w2, eta2)
# Example: fitting a doublet near 46°
mask = (two_theta > 44.5) & (two_theta < 47.5)
x, y = two_theta[mask], I_smooth[mask]
# Initial guesses for two peaks
p0 = [500, 45.8, 0.2, 0.5, # peak 1
400, 46.5, 0.2, 0.5] # peak 2
popt, pcov = curve_fit(double_pseudo_voigt, x, y, p0=p0)
If two peaks are separated by less than about 0.5× their individual FWHM, the deconvolution becomes unreliable — the fit parameters are strongly correlated, and small changes in initial guesses produce very different results. In such cases, it is better to report a single blended peak and note the overlap, rather than forcing a dubious decomposition. Rietveld refinement handles this more robustly because it constrains peak positions through the crystal structure model.
7. Extracting the FWHM: Input for Size-Strain Analysis
The primary output of peak fitting for crystallographic analysis is the FWHM as a function of \(2\theta\). This table becomes the input for Williamson-Hall analysis (covered in the next post) and Scherrer size estimation.
Before using the FWHM values, you must subtract the instrumental contribution. If you have measured a standard reference material under identical conditions, the instrumental FWHM (\(\beta_\text{inst}\)) at each \(2\theta\) can be interpolated from that measurement. The sample broadening is then:
For a pseudo-Voigt profile, you can use the Thompson-Cox-Hastings approximation to separate the Gaussian and Lorentzian FWHM components and correct each one appropriately. In practice, for nanocrystalline samples where the sample broadening dominates, the simpler Gaussian deconvolution is usually sufficient.
Python — Collecting FWHM values for Williamson-Hallprint(f'{"Peak":>5} {"2θ (°)":>10} {"FWHM (°)":>10} {"η":>6} {"Area":>10}')
for i, r in enumerate(results):
print(f'{i+1:5d} {r["center"]:10.3f} {r["fwhm"]:10.4f} '
f'{r["eta"]:6.2f} {r["area"]:10.1f}')
# Convert FWHM from degrees to radians for W-H analysis
fwhm_rad = np.array([r['fwhm'] for r in results]) * np.pi / 180
centers_rad = np.array([r['center'] for r in results]) * np.pi / 360 # θ, not 2θ
8. Which Profile Should You Use?
Here is a practical decision tree:
- Quick look / exploratory analysis: Gaussian fits are fast, converge easily, and give you approximate FWHM values. Good enough for a first pass.
- Publication-quality single-peak analysis: Pseudo-Voigt. It captures both the peak shape and the mixing parameter \(\eta\), which is itself physically meaningful.
- Rietveld refinement: The refinement software (FullProf, GSAS-II) handles profile functions internally — typically Thompson-Cox-Hastings pseudo-Voigt. You do not need to fit peaks manually.
- Very asymmetric peaks: At low \(2\theta\) (< 30°), axial divergence causes pronounced asymmetry. Consider the split pseudo-Voigt or Finger-Cox-Jephcoat correction, both available in Rietveld codes.
9. Common Pitfalls
The Scherrer equation and Williamson-Hall plot require \(\beta\) in radians, but fitting always gives you FWHM in degrees. Forgetting to convert is the most common source of crystallite sizes that are off by a factor of 57.
If you skip the instrumental correction, your crystallite sizes will be systematically too small. For well-crystallized samples (narrow peaks), the instrumental contribution can be a significant fraction of the observed width. Always measure a standard or use the instrument's known broadening function.
The Lorentzian and Voigt tails extend far from the peak center. If your fitting window is too narrow, the tail contribution is cut off and the FWHM is underestimated. A good rule: the window should extend at least 5× the expected FWHM on each side of the peak center.
If you did not strip K\(\alpha_2\) during preprocessing, each peak is actually a doublet. Fitting a single profile to a doublet will give you a systematically broadened FWHM and a distorted shape. Either strip K\(\alpha_2\) first (Rachinger correction) or fit each peak as a doublet with a fixed intensity ratio of 0.5 and a fixed splitting from the known \(\lambda_{\alpha_1}\)/\(\lambda_{\alpha_2}\) wavelengths.