Model-free techniques, such as machine learning (ML), have recently attracted much interest toward the physical layer design (e.g., symbol detection, channel estimation, and beamforming). Most of these ML techniques employ centralized learning (CLK) schemes and assume the availability of datasets at a parameter server (PS), demanding the transmission of data from edge devices, such as mobile phones, to the PS. Exploiting the data generated at the edge, federated learning (FL) has been proposed recently as a distributed learning scheme, in which each device computes the model parameters and sends them to the PS for model aggregation, while the datasets are kept intact at the edge. Thus, FL is more communication-efficient and privacy-preserving than CL and applicable to the wireless communication scenarios, wherein the data are generated at the edge devices. This article presents the recent advances in FL-based training for physical layer design problems. Compared to CL, the effectiveness of FL is presented in terms of communication overhead with a slight performance loss in the learning accuracy. The design challenges, such as model, data, and hardware complexity, are also discussed in detail along with possible solutions.