ARaNet: Attention and Residual Aware Network for Resilient Digital Twins in Rail Transit Equipment Manufacturing
DOI:
https://doi.org/10.4108/eetsis.13054
Keywords:
3D sensor data recovery, digital twin resilience, rail transit equipment manufacturing, attention mechanism, residual refinement, point cloud completion
Abstract
INTRODUCTION: In rail transit equipment manufacturing, which encompasses locomotive body welding, metro vehicle assembly, and high-speed rail carriage production, high-fidelity 3D point cloud data acquired by industrial sensors serves as the foundation for digital twin modeling, automated quality inspection, and robotic guidance. However, harsh production environments characterized by metallic dust, welding spatter, mechanical vibration, and frequent occlusions by fixtures and tooling inevitably introduce severe data corruption, including missing regions and non-uniform point density. Such degraded sensor data undermines the reliability of downstream manufacturing processes that depend on accurate 3D representations.
OBJECTIVES: This paper presents an effective 3D sensor data recovery method that reconstructs complete geometric representations from corrupted partial scans, specifically targeting the data integrity challenges encountered in rail transit equipment manufacturing. The proposed approach aims to simultaneously restore global structural completeness and local geometric precision, thereby enabling resilient digital twin systems that maintain operational continuity despite sensor-induced data loss.
METHODS: We propose an Attention and Residual Aware Network (ARaNet) featuring a Multi-Scale Channel-Aware Convolution encoder and a Hierarchical Residual-Aware Decoder. The encoder performs Farthest Point Sampling at multiple resolutions (2048, 1024, 512 points) and applies a channel attention mechanism to dynamically weight feature channels, emphasizing geometrically salient structures such as sharp edges, curved surfaces, and mechanical joints that are critical in rail transit component geometries, including bogie frames, car body shells, and coupler assemblies. The decoder progressively generates point clouds from coarse resolution (64 points) to medium resolution (128 points) to high resolution (2048 points), incorporating a residual refinement module that predicts coordinate offsets to rectify local geometric errors. This capability is particularly valuable for precision metrology in component manufacturing.
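The two encoder ingredients named above can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation: the function names, the squeeze-and-excitation form of the channel gate, and the weight matrices `w1` and `w2` are assumptions made here for illustration, since the abstract does not specify the attention module in detail.

```python
import math


def farthest_point_sampling(points, k, start=0):
    """Return k indices; each new index is the point farthest from those already chosen."""
    n = len(points)
    if not 0 < k <= n:
        raise ValueError("k must be between 1 and len(points)")
    chosen = [start]
    # min_d2[i] holds the squared distance from point i to its nearest chosen point.
    min_d2 = [math.inf] * n
    for _ in range(k - 1):
        last = points[chosen[-1]]
        for i, p in enumerate(points):
            d2 = sum((a - b) ** 2 for a, b in zip(p, last))
            if d2 < min_d2[i]:
                min_d2[i] = d2
        chosen.append(max(range(n), key=min_d2.__getitem__))
    return chosen


def multi_scale_samples(points, sizes=(2048, 1024, 512)):
    """Return one FPS subset per size (sizes as quoted in the abstract).

    With a fixed start point, FPS is prefix-consistent: the first s indices of
    a longer run equal a run of length s, so a single pass is enough.
    """
    order = farthest_point_sampling(points, max(sizes))
    return [[points[i] for i in order[:s]] for s in sizes]


def channel_gate(features, w1, w2):
    """Assumed SE-style channel attention over an N x C list of point features.

    w1 has shape (C/r) x C and w2 has shape C x (C/r); both are hypothetical
    learned weights, with biases omitted for brevity.
    """
    n, c = len(features), len(features[0])
    # Squeeze: average each channel over all points.
    squeezed = [sum(f[j] for f in features) / n for j in range(c)]
    # Excite: bottleneck with ReLU, then a sigmoid gate per channel.
    hidden = [max(0.0, sum(w * s for w, s in zip(row, squeezed))) for row in w1]
    gates = [1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden)))) for row in w2]
    # Scale: every point's channels are reweighted by the shared gates.
    return [[f[j] * gates[j] for j in range(c)] for f in features]
```

Calling `multi_scale_samples(scan)` on a partial scan with at least 2048 points would yield the three resolutions listed above. The pure-Python loops are meant for readability only; a practical encoder would run both steps as batched tensor operations.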
RESULTS: Experimental evaluation on the ShapeNet and ModelNet40 datasets demonstrates that ARaNet achieves substantial improvements over benchmark models PCN and PF-Net. Quantitative assessment shows average reductions of 12.9% and 23.8% in the Gt-to-Pre and Pre-to-Gt metrics, respectively. The generated point clouds exhibit more uniform distributions and superior detail restoration, particularly for complex mechanical geometries with curved surfaces and structural joints characteristic of rail transit equipment components.
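For readers unfamiliar with the two metrics, the sketch below assumes they are the one-sided Chamfer terms commonly reported in point cloud completion work: Pre-to-Gt averages the distance from each predicted point to its nearest ground-truth point, and Gt-to-Pre does the reverse. The use of squared Euclidean distance and all function names are assumptions, not details confirmed by the abstract.

```python
def one_sided_chamfer(src, dst):
    """Mean squared distance from each point in src to its nearest neighbour in dst."""
    total = 0.0
    for p in src:
        total += min(sum((a - b) ** 2 for a, b in zip(p, q)) for q in dst)
    return total / len(src)


def completion_errors(pred, gt):
    """Report both directions, since each one catches a different failure mode."""
    return {
        "pre_to_gt": one_sided_chamfer(pred, gt),  # penalises stray predicted points
        "gt_to_pre": one_sided_chamfer(gt, pred),  # penalises uncovered ground truth
    }


def relative_reduction(baseline, ours):
    """Percentage by which an error drops relative to a baseline error."""
    return 100.0 * (baseline - ours) / baseline


# Toy case: the prediction matches one ground-truth point but misses the other.
gt = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
pred = [(0.0, 0.0, 0.0)]
errors = completion_errors(pred, gt)
```

In the toy case the Pre-to-Gt error is zero because the single predicted point is exact, while the Gt-to-Pre error is nonzero because half of the ground truth is left uncovered. This asymmetry is why the two directions are reported separately.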
CONCLUSION: The results validate the effectiveness of integrating channel attention with residual refinement for industrial 3D sensor data recovery. ARaNet provides a robust data recovery foundation for resilient digital twin systems in rail transit equipment manufacturing, enabling sustained operational capability and rapid geometric reconstruction even when sensor data is compromised by production environment disruptions. However, ARaNet has so far been validated exclusively on synthetic training data, and its performance under extremely high missing-region ratios or in real industrial deployment scenarios without domain adaptation warrants further investigation.
License
Copyright (c) 2026 Xi Chen, Xiaolong Gao, Hanyue Zhan, Wanting Liu

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
This is an open access article distributed under the terms of the CC BY-NC-SA 4.0, which permits copying, redistributing, remixing, transformation, and building upon the material in any medium so long as the original work is properly cited.