This paper introduces attentional feature fusion, a uniform scheme for merging features from different network layers or branches. The method uses a multi-scale channel attention module to fuse features with inconsistent semantics and scales, and iterative attentional feature fusion to alleviate the bottleneck caused by the initial integration of the input features.
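To make the idea concrete, the sketch below shows one plausible PyTorch-style realization of such a fusion, assuming element-wise addition as the initial integration and a channel-attention module built from a global (pooled) branch and a local (point-wise) branch; the class names (`MSCAM`, `AFF`) and the `reduction` parameter are illustrative assumptions, not the authors' reference implementation.

```python
import torch
import torch.nn as nn


class MSCAM(nn.Module):
    """Multi-scale channel attention (sketch): sum a global (pooled) and a
    local (point-wise) channel-context branch, then map to weights in (0, 1)."""

    def __init__(self, channels, reduction=4):
        super().__init__()
        mid = max(channels // reduction, 1)
        # Global context: squeeze spatial dims, then a 1x1-conv bottleneck.
        self.global_ctx = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, mid, kernel_size=1),
            nn.BatchNorm2d(mid),
            nn.ReLU(inplace=True),
            nn.Conv2d(mid, channels, kernel_size=1),
            nn.BatchNorm2d(channels),
        )
        # Local context: the same bottleneck without pooling, preserving
        # spatial detail so attention can vary per position.
        self.local_ctx = nn.Sequential(
            nn.Conv2d(channels, mid, kernel_size=1),
            nn.BatchNorm2d(mid),
            nn.ReLU(inplace=True),
            nn.Conv2d(mid, channels, kernel_size=1),
            nn.BatchNorm2d(channels),
        )

    def forward(self, x):
        return torch.sigmoid(self.global_ctx(x) + self.local_ctx(x))


class AFF(nn.Module):
    """Attentional fusion of two same-shaped feature maps x and y (sketch)."""

    def __init__(self, channels, reduction=4):
        super().__init__()
        self.attn = MSCAM(channels, reduction)

    def forward(self, x, y):
        w = self.attn(x + y)          # attention computed on the initial integration
        return w * x + (1.0 - w) * y  # soft selection between the two inputs


if __name__ == "__main__":
    # Example: fuse a skip-connection feature map with a same-sized decoder feature map.
    x = torch.randn(4, 64, 32, 32)
    y = torch.randn(4, 64, 32, 32)
    print(AFF(64)(x, y).shape)  # torch.Size([4, 64, 32, 32])
```

In this reading, the iterative variant would simply replace the plain addition `x + y` with another attentional fusion of the same form, so that the initial integration itself is attention-weighted rather than a fixed sum.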