15,045 results on '"Receptive field"'
Search Results
2. A new role for excitation in the retinal direction‐selective circuit.
- Author
-
Ankri, Lea, Riccitelli, Serena, and Rivlin‐Etzion, Michal
- Subjects
*RECEPTIVE fields (Neurology), *RETINAL ganglion cells, *VISUAL acuity, *GABA, *ELECTROPHYSIOLOGY
- Abstract
A key feature of the receptive field of neurons in the visual system is their centre–surround antagonism, whereby the centre and the surround exhibit responses of opposite polarity. This organization is thought to enhance visual acuity, but whether and how such antagonism plays a role in more complex processing remains poorly understood. Here, we investigate the role of centre and surround receptive fields in retinal direction selectivity by exposing posterior‐preferring On–Off direction‐selective ganglion cells (pDSGCs) to adaptive light and recording their response to globally moving objects. We reveal that light adaptation leads to surround expansion in pDSGCs. The pDSGCs maintain their original directional tuning in the centre receptive field, but present the oppositely tuned response in their surround. Notably, although inhibition is the main substrate for retinal direction selectivity, we found that following light adaptation, both the centre‐ and surround‐mediated responses originate from directionally tuned excitatory inputs. Multi‐electrode array recordings show similar oppositely tuned responses in other DSGC subtypes. Together, these data attribute a new role for excitation in the direction‐selective circuit. This excitation carries an antagonistic centre–surround property, possibly designed to sharpen the detection of motion direction in the retina. 
Key points: Receptive fields of direction‐selective retinal ganglion cells expand asymmetrically following light adaptation. The increase in the surround receptive field generates a delayed spiking phase that is tuned to the null direction and is mediated by excitation. Following light adaptation, excitation rules the computation in the centre receptive field and is tuned to the preferred direction. GABAergic and glycinergic inputs modulate the null‐tuned delayed response differentially. Null‐tuned delayed spiking phases can be detected in all types of direction‐selective retinal ganglion cells. Light adaptation exposes a hidden directional excitation in the circuit, which is tuned to opposite directions in the centre and surround receptive fields. [ABSTRACT FROM AUTHOR]
- Published
- 2024
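The centre-surround antagonism described in this abstract is often illustrated with a difference-of-Gaussians profile. The sketch below is that textbook illustration only, not the model used in the study; the function names and all parameter values (`sigma_centre`, `sigma_surround`, `surround_gain`) are assumptions chosen for demonstration.

```python
import math


def gaussian(x, sigma):
    """Unit-area Gaussian evaluated at distance x from the field centre."""
    return math.exp(-x * x / (2.0 * sigma * sigma)) / (sigma * math.sqrt(2.0 * math.pi))


def dog_weight(x, sigma_centre=1.0, sigma_surround=3.0, surround_gain=1.0):
    """Difference of Gaussians: narrow positive centre minus broad surround.

    All default values are hypothetical and only meant to show the sign flip.
    """
    return gaussian(x, sigma_centre) - surround_gain * gaussian(x, sigma_surround)


if __name__ == "__main__":
    for distance in (0.0, 1.0, 2.0, 3.0, 5.0):
        polarity = "centre (+)" if dog_weight(distance) > 0 else "surround (-)"
        print(f"x = {distance:.1f}  weight = {dog_weight(distance):+.4f}  {polarity}")
```

With these values the weight is positive near the centre and negative further out, which is the opposite-polarity arrangement the abstract refers to.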
3. Alpha-2 nicotinic acetylcholine receptors regulate spectral integration in auditory cortex.
- Author
-
Intskirveli, Irakli, Gil, Susan, Lazar, Ronit, and Metherate, Raju
- Subjects
NICOTINIC acetylcholine receptors, PYRAMIDAL neurons, NEURAL circuitry, AUDITORY neurons, AUDITORY cortex, NICOTINIC receptors
- Abstract
Introduction: In primary auditory cortex (A1), nicotinic acetylcholine receptors (nAChRs) containing α2 subunits are expressed in layer 5 Martinotti cells (MCs)—inhibitory interneurons that send a main axon to superficial layers to inhibit distal apical dendrites of pyramidal cells (PCs). MCs also contact interneurons in supragranular layers that, in turn, inhibit PCs. Thus, MCs may regulate PCs via inhibition and disinhibition, respectively, of distal and proximal apical dendrites. Auditory inputs to PCs include thalamocortical inputs to middle layers relaying information about characteristic frequency (CF) and near-CF stimuli, and intracortical long-distance ("horizontal") projections to multiple layers carrying information about spectrally distant ("nonCF") stimuli. CF and nonCF inputs integrate to create broad frequency receptive fields (RFs). Systemic administration of nicotine activates nAChRs to "sharpen" RFs—to increase gain within a narrowed RF—resulting in enhanced responses to CF stimuli and reduced responses to nonCF stimuli. While nicotinic mechanisms to increase gain have been identified, the mechanism underlying RF narrowing is unknown. Methods: Here, we examine the role of α2 nAChRs in mice with α2 nAChR-expressing neurons labeled fluorescently, and in mice with α2 nAChRs genetically deleted. Results: The distribution of fluorescent neurons in auditory cortex was consistent with previous studies demonstrating α2 nAChRs in layer 5 MCs, including nonpyramidal somata in layer 5 and dense processes in layer 1. We also observed label in subcortical auditory regions, including processes, but no somata, in the medial geniculate body, and both fibers and somata in the inferior colliculus. 
Using electrophysiological (current-source density) recordings in α2 nAChR knock-out mice, we found that systemic nicotine failed to enhance CF-evoked inputs to layer 4, suggesting a role for subcortical α2 nAChRs, and failed to reduce nonCF-evoked responses, suggesting that α2 nAChRs regulate horizontal projections to produce RF narrowing. Discussion: The results support the hypothesis that α2 nAChRs function to simultaneously enhance RF gain and narrow RF breadth in A1. Notably, a similar neural circuit may recur throughout cortex and hippocampus, suggesting widespread conserved functions regulated by α2 nAChRs. [ABSTRACT FROM AUTHOR]
- Published
- 2024
4. Cross-scale information enhancement for object detection.
- Author
-
Li, Tie-jun and Zhao, Hui-feng
- Subjects
MULTISCALE modeling, PROBLEM solving, DETECTORS, INFORMATION design
- Abstract
Object detection usually adopts multi-scale fusion to enrich the information of the object, and the Feature Pyramid Network (FPN) is a common method for multi-scale fusion. However, traditional fusion methods such as FPN cause information loss when fusing high-level feature maps with low-level feature maps. To solve this problem, we propose a simple but effective cross-scale fusion method that fully uses the information of multi-scale feature maps. In addition, to better utilize the multi-scale contextual information, we design the Selective Information Enhancement (SIE) module. The SIE module dynamically selects information at the more important scales for objects of different sizes and fuses the selected information with feature maps for information enhancement. We apply our method to the Single Shot Multibox Detector (SSD) and propose a Cross-Scale Information Enhancement Single Shot Multibox Detector (CESSD). The CESSD improves the object detection capability of SSD models by fusing multi-scale features and selectively enhancing feature map information. To evaluate the effectiveness of the model, we validated it on the Pascal VOC2007 test set with 300 × 300 inputs, where the mean Average Precision (mAP) of CESSD reached 79.8%. [ABSTRACT FROM AUTHOR]
- Published
- 2024
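For orientation, the conventional FPN top-down step that this abstract takes as its baseline can be sketched as nearest-neighbour upsampling of the coarser map followed by element-wise addition. This is a minimal single-channel sketch under stated assumptions: the function names are hypothetical, the 1×1 lateral and 3×3 smoothing convolutions of a real FPN are omitted, and it does not represent the proposed cross-scale fusion or the SIE module.

```python
def upsample_nearest_2x(fmap):
    """Double height and width of a 2D map by repeating each value."""
    out = []
    for row in fmap:
        wide = [value for value in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out


def fuse_top_down(low_level, high_level):
    """Add an upsampled high-level (coarse) map to a low-level (fine) map."""
    up = upsample_nearest_2x(high_level)
    if len(up) != len(low_level) or len(up[0]) != len(low_level[0]):
        raise ValueError("low_level must be exactly twice the size of high_level")
    return [[a + b for a, b in zip(fine_row, up_row)]
            for fine_row, up_row in zip(low_level, up)]


if __name__ == "__main__":
    coarse = [[1, 2], [3, 4]]
    fine = [[0] * 4 for _ in range(4)]
    for row in fuse_top_down(fine, coarse):
        print(row)
```

Each coarse value is simply copied into a 2×2 patch, so fine spatial detail cannot be recovered from the coarse level, which is one way to read the information-loss concern raised in the abstract.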
5. Grape clusters detection based on multi-scale feature fusion and augmentation.
- Author
-
Ma, Jinlin, Xu, Silong, Ma, Ziping, Fu, Hong, and Lin, Baobao
- Abstract
This paper addresses the challenge of low detection accuracy of grape clusters caused by scale differences, illumination changes, and occlusion in realistic and complex scenes. We propose a multi-scale feature fusion and augmentation YOLOv7 network to enhance the detection accuracy of grape clusters across variable environments. First, we design a Multi-Scale Feature Extraction Module (MSFEM) to enhance feature extraction for small-scale targets. Second, we propose the Receptive Field Augmentation Module (RFAM), which uses dilated convolution to expand the receptive field and enhance the detection accuracy for objects of various scales. Third, we present the Spatial Pyramid Pooling Cross Stage Partial Concatenation Faster (SPPCSPCF) module to fuse multi-scale features, improving accuracy and speeding up model training. Finally, we integrate the Residual Global Attention Mechanism (ResGAM) into the network to better focus on crucial regions and features. Experimental results show that our proposed method achieves a mAP@0.5 of 93.29% on the GrappoliV2 dataset, an improvement of 5.39% over YOLOv7. Additionally, our method increases Precision, Recall, and F1 score by 2.83%, 3.49%, and 0.07, respectively. Compared to state-of-the-art detection methods, our approach demonstrates superior detection performance and adaptability to various environments for detecting grape clusters. [ABSTRACT FROM AUTHOR]
- Published
- 2024
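How dilated convolution expands the receptive field, as mentioned for the RFAM, can be sketched with the standard span relation k_eff = d(k - 1) + 1. The RFAM design itself is not given in the abstract, so the function name, the 3×3 kernel, and the dilation rates below are assumptions for illustration only.

```python
def effective_kernel_size(kernel_size, dilation):
    """Input span (in pixels) covered by one dilated convolution kernel."""
    if kernel_size < 1 or dilation < 1:
        raise ValueError("kernel_size and dilation must be positive integers")
    return dilation * (kernel_size - 1) + 1


if __name__ == "__main__":
    # A 3x3 kernel at a few illustrative dilation rates.
    for rate in (1, 2, 3, 4):
        span = effective_kernel_size(3, rate)
        print(f"dilation {rate}: covers {span}x{span} pixels with 9 weights")
```

The number of weights stays at nine in every case; only the spacing between the sampled input positions grows.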
6. Stage-by-Stage Adaptive Alignment Mechanism for Object Detection in Aerial Images.
- Author
-
Zhu, Jiangang, Jing, Donglin, and Gao, Dapeng
- Subjects
REMOTE sensing, DYNAMIC models, DETECTORS, ANCHORS, CLASSIFICATION
- Abstract
Object detection in aerial images has found a broader range of applications in the past few years. Unlike targets in horizontally captured images, targets in aerial photos generally have arbitrary orientations, multiple scales, and high aspect ratios. Existing methods often employ a classification backbone network to extract translation-equivariant features (TEFs) and utilize many predefined anchors to handle objects with diverse appearance variations. However, they encounter misalignment at three levels (spatial, feature, and task) during different detection stages. In this study, we propose a model called the Staged Adaptive Alignment Detector (SAADet) to solve these challenges. This method utilizes a Spatial Selection Adaptive Network (SSANet) to achieve spatial alignment of the convolution receptive field to the scale of the object by using a convolution sequence with an increasing dilation rate to capture the spatial context information of different ranges and evaluating this information through dynamic weighting by the model. After the preset horizontal anchor is corrected to an oriented anchor, feature alignment is achieved through an alignment convolution guided by the oriented anchor, which aligns the backbone features with the object's orientation. To accomplish task alignment, features are decoupled using the Active Rotating Filter to mitigate inconsistencies caused by sharing backbone features between the regression and classification tasks. The experimental results show that SAADet achieves a balance between speed and accuracy on two aerial image datasets, HRSC2016 and UCAS-AOD. [ABSTRACT FROM AUTHOR]
- Published
- 2024
7. SmokeFireNet: A Lightweight Network for Joint Detection of Forest Fire and Smoke.
- Author
-
Chen, Yi and Wang, Fang
- Subjects
FOREST fires, EXTREME weather, FOREST protection, POLLUTION, FOREST microclimatology
- Abstract
In recent years, forest fires have been occurring frequently around the globe, affected by extreme weather and dry climate, causing serious economic losses and environmental pollution. In this context, timely detection of forest fire smoke is crucial for realizing real-time early warning of fires. However, fire and smoke from forest fires can spread to cover large areas and may affect distant areas. In this paper, a lightweight joint forest fire and smoke detection network, SmokeFireNet, is proposed, which employs ShuffleNetV2 as the backbone for efficient feature extraction, effectively addressing the computational efficiency challenges of traditional methods. To integrate multi-scale information and enhance the semantic feature extraction capability, a feature pyramid network (FPN) and path aggregation network (PAN) are introduced in this paper. In addition, the FPN network is optimized by a lightweight DySample upsampling operator. The model also incorporates efficient channel attention (ECA), which can pay more attention to the detection of forest fires and smoke regions while suppressing irrelevant features. Finally, by embedding the receptive field block (RFB), the model further improves its ability to understand contextual information and capture detailed features of fire and smoke, thus improving the overall detection accuracy. The experimental results show that SmokeFireNet outperforms other mainstream target detection algorithms, with an average AP_all of 86.2%, 114 FPS, and 8.4 GFLOPs, and provides effective technical support for forest fire prevention work in terms of average precision, frame rate, and computational complexity. In the future, the SmokeFireNet model is expected to play a greater role in the field of forest fire prevention and make a greater contribution to the protection of forest resources and the ecological environment. [ABSTRACT FROM AUTHOR]
- Published
- 2024
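The efficient channel attention (ECA) step named in this abstract is commonly described as global average pooling per channel, a small 1D convolution across neighbouring channels, and a sigmoid gate that rescales each channel. The sketch below follows that general description; the function names are hypothetical and the kernel weights are fixed placeholders, whereas in a trained network they would be learned.

```python
import math


def channel_gates(channel_means, kernel):
    """1D convolution over neighbouring channel means (zero padding), then sigmoid."""
    radius = len(kernel) // 2
    gates = []
    for i in range(len(channel_means)):
        acc = 0.0
        for t, weight in enumerate(kernel):
            j = i + t - radius
            if 0 <= j < len(channel_means):
                acc += weight * channel_means[j]
        gates.append(1.0 / (1.0 + math.exp(-acc)))
    return gates


def apply_eca(feature_maps, kernel=(1 / 3, 1 / 3, 1 / 3)):
    """Rescale each channel (a 2D list) by its attention gate.

    The uniform default kernel is a placeholder assumption, not learned weights.
    """
    means = [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in feature_maps]
    gates = channel_gates(means, kernel)
    return [[[value * gate for value in row] for row in ch]
            for ch, gate in zip(feature_maps, gates)]


if __name__ == "__main__":
    maps = [[[2.0, 2.0]], [[4.0, 4.0]], [[-1.0, -1.0]]]  # 3 channels of size 1x2
    print(apply_eca(maps))
```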
8. MRI-Based Brain Tumor Classification Using a Dilated Parallel Deep Convolutional Neural Network.
- Author
-
Rahman, Takowa, Islam, Md Saiful, and Uddin, Jia
- Subjects
BRAIN tumors, CONVOLUTIONAL neural networks, MACHINE learning, DATA analysis, ACCURACY
- Abstract
Brain tumors are frequently classified with high accuracy using convolutional neural networks (CNNs) to better comprehend the spatial connections among pixels in complex pictures. Due to their tiny receptive fields, the majority of deep convolutional neural network (DCNN)-based techniques overfit and are unable to extract global context information from more significant regions. While dilated convolution retains data resolution at the output layer and increases the receptive field without adding computation, stacking several dilated convolutions has the drawback of producing a grid effect. This research suggests a dilated parallel deep convolutional neural network (PDCNN) architecture that preserves a wide receptive field in order to handle gridding artifacts and extract both coarse and fine features from the images. This article applies multiple preprocessing strategies to the input MRI images used to train the model. By contrasting various dilation rates, the global path uses a low dilation rate (2,1,1), while the local path uses a high dilation rate (4,2,1) for decremental even numbers to tackle gridding artifacts and to extract both coarse and fine features from the two parallel paths. Using three different types of MRI datasets, the suggested dilated PDCNN with the average ensemble method performs best. The accuracy achieved for the multiclass Kaggle dataset-III, Figshare dataset-II, and binary tumor identification dataset-I is 98.35%, 98.13%, and 98.67%, respectively. In comparison to state-of-the-art techniques, the suggested structure improves results by extracting both fine and coarse features, making it efficient. [ABSTRACT FROM AUTHOR]
- Published
- 2024
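The grid effect and the decreasing dilation rates mentioned in this abstract can be illustrated in one dimension. The sketch assumes stride-1 convolutions with a kernel size of 3 (the abstract does not state the kernel size) and uses hypothetical function names; it computes the receptive field of a stack and lists which input offsets actually contribute to one output position.

```python
def stacked_receptive_field(dilations, kernel_size=3):
    """Receptive field of stride-1 dilated convolutions applied in sequence."""
    field = 1
    for rate in dilations:
        field += (kernel_size - 1) * rate
    return field


def touched_offsets(dilations, kernel_size=3):
    """1D input offsets that reach one output position (odd kernel sizes only)."""
    half = kernel_size // 2
    offsets = {0}
    for rate in dilations:
        offsets = {o + tap * rate for o in offsets for tap in range(-half, half + 1)}
    return sorted(offsets)


if __name__ == "__main__":
    for rates in ((2, 2, 2), (2, 1, 1), (4, 2, 1)):
        field = stacked_receptive_field(rates)
        used = len(touched_offsets(rates))
        print(f"dilations {rates}: receptive field {field}, positions used {used}")
```

Under these assumptions a constant rate such as (2, 2, 2) reaches only every second input position inside its receptive field, which is the gridding pattern, while the (2, 1, 1) and (4, 2, 1) sequences quoted in the abstract cover every position in theirs.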
9. Grape clusters detection based on multi-scale feature fusion and augmentation
- Author
-
Jinlin Ma, Silong Xu, Ziping Ma, Hong Fu, and Baobao Lin
- Subjects
Grape clusters detection, Multi-scale, Receptive field, Feature fusion, Feature augmentation, Medicine, Science
- Abstract
This paper addresses the challenge of low detection accuracy of grape clusters caused by scale differences, illumination changes, and occlusion in realistic and complex scenes. We propose a multi-scale feature fusion and augmentation YOLOv7 network to enhance the detection accuracy of grape clusters across variable environments. First, we design a Multi-Scale Feature Extraction Module (MSFEM) to enhance feature extraction for small-scale targets. Second, we propose the Receptive Field Augmentation Module (RFAM), which uses dilated convolution to expand the receptive field and enhance the detection accuracy for objects of various scales. Third, we present the Spatial Pyramid Pooling Cross Stage Partial Concatenation Faster (SPPCSPCF) module to fuse multi-scale features, improving accuracy and speeding up model training. Finally, we integrate the Residual Global Attention Mechanism (ResGAM) into the network to better focus on crucial regions and features. Experimental results show that our proposed method achieves a mAP@0.5 of 93.29% on the GrappoliV2 dataset, an improvement of 5.39% over YOLOv7. Additionally, our method increases Precision, Recall, and F1 score by 2.83%, 3.49%, and 0.07, respectively. Compared to state-of-the-art detection methods, our approach demonstrates superior detection performance and adaptability to various environments for detecting grape clusters.
- Published
- 2024
10. Feature-adaptive FPN with multiscale context integration for underwater object detection.
- Author
-
Bhalla, Shikha, Kumar, Ashish, and Kushwaha, Riti
- Abstract
Underwater object detection is vital for diverse applications, from studies in marine biology to underwater robotics. However, underwater environments pose unique challenges, including reduced visibility due to color distortion, light attenuation, and complex backgrounds. Traditional computer vision methods have limitations, prompting the adoption of deep learning for underwater object detection. Despite progress, challenges persist, such as visual degradation, scale variations, diverse marine species, and complex backgrounds. To address these issues, we propose Feature-Adaptive FPN with Multiscale Context Integration (FA-FPN-MCI), a novel deep-learning algorithm aimed at enhancing both detection and domain generalization performance. We integrate the Style Normalization and Restitution (SNR) module for domain generalization, Receptive Field Blocks (RFBs) for fine-grained detail capture, and a twin-branch Global Context Module (TBGCM) for multiscale context information. We enhance lateral connections within the Feature Pyramid Network (FPN) with deformable convolution. Experimental results reveal that the proposed method attains a mean average precision of 84.2%. Additionally, other performance metrics were evaluated, on which the proposed method outperforms all other methods used for comparison. [ABSTRACT FROM AUTHOR]
- Published
- 2024
11. PillarVTP: vehicle trajectory prediction method based on local point cloud aggregation and receptive field expansion.
- Author
-
Liao, Zhuhua, Yang, Jiyuan, Zhao, Yijiang, Liu, Yizhi, and Zhang, Hui
- Abstract
Vehicle trajectory prediction plays a crucial role in the control and safety warning of autonomous vehicles. Existing methods often depend on costly high definition (HD) maps for generating trajectories to fit their scenarios, or involve inefficient aggregation of local point clouds into voxels. Therefore, an end-to-end vehicle trajectory prediction method (PillarVTP) is proposed based on local point cloud aggregation and receptive field expansion. First, we construct a novel pillar-based object detection network, introducing SPPCSPC, which uses max pooling layers with multiple kernel sizes on a single feature level, as the neck for extracting multi-scale features, and improving ResNet-18 by adding a depth stage to expand the receptive field at multiple levels. Then, we perform feature upsampling to improve performance before predicting vehicle positions, and a shallow convolutional network is used to implement the future feature learning network, which learns future features from the previous features for predicting vehicle positions in future frames. Subsequently, the positions of vehicles are matched greedily from future frames to the current frame, and the matched future trajectories are associated with the vehicles detected in the current frame. Finally, the proposed PillarVTP is evaluated on the nuScenes and Argoverse 1 datasets. Experimental results demonstrate that PillarVTP outperforms FutureDet, a recent end-to-end prediction method based on point cloud data, by 3.4% and surpasses Trajectron++, a traditional multi-stage method, by 13.7%. Furthermore, PillarVTP shows good robustness under various weather conditions. [ABSTRACT FROM AUTHOR]
- Published
- 2024
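The pooling idea attributed to SPPCSPC in this abstract (max pooling with multiple kernel sizes on a single feature level) can be sketched as parallel stride-1, same-size max pools whose outputs are kept side by side. The function names are hypothetical, the default kernel sizes (5, 9, 13) are an assumption not stated in the abstract, and the convolutional and cross-stage-partial parts of the module are omitted.

```python
def max_pool_same(fmap, kernel_size):
    """Stride-1 max pooling that keeps the map size; borders use the valid window."""
    height, width = len(fmap), len(fmap[0])
    radius = kernel_size // 2
    pooled = []
    for i in range(height):
        row = []
        for j in range(width):
            window = [fmap[y][x]
                      for y in range(max(0, i - radius), min(height, i + radius + 1))
                      for x in range(max(0, j - radius), min(width, j + radius + 1))]
            row.append(max(window))
        pooled.append(row)
    return pooled


def spp_branches(fmap, kernel_sizes=(5, 9, 13)):
    """Return the input map plus one pooled map per kernel size (assumed sizes)."""
    return [fmap] + [max_pool_same(fmap, k) for k in kernel_sizes]


if __name__ == "__main__":
    fmap = [[0] * 7 for _ in range(7)]
    fmap[3][3] = 1  # a single activation in the middle
    for branch in spp_branches(fmap, kernel_sizes=(3, 5)):
        print(sum(map(sum, branch)), "positions see the activation")
```

Larger kernels spread a single activation over a wider neighbourhood, so the branches describe the same feature level at several context sizes.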
12. MRI-Based Brain Tumor Classification Using a Dilated Parallel Deep Convolutional Neural Network
- Author
-
Takowa Rahman, Md Saiful Islam, and Jia Uddin
- Subjects
brain tumor classification, data augmentation, grid effect, multiscale dilated parallel convolution, machine learning classifiers, receptive field, Electronic computers. Computer science, QA75.5-76.95
- Abstract
Brain tumors are frequently classified with high accuracy using convolutional neural networks (CNNs) to better comprehend the spatial connections among pixels in complex pictures. Due to their tiny receptive fields, the majority of deep convolutional neural network (DCNN)-based techniques overfit and are unable to extract global context information from more significant regions. While dilated convolution retains data resolution at the output layer and increases the receptive field without adding computation, stacking several dilated convolutions has the drawback of producing a grid effect. This research suggests a dilated parallel deep convolutional neural network (PDCNN) architecture that preserves a wide receptive field in order to handle gridding artifacts and extract both coarse and fine features from the images. This article applies multiple preprocessing strategies to the input MRI images used to train the model. By contrasting various dilation rates, the global path uses a low dilation rate (2,1,1), while the local path uses a high dilation rate (4,2,1) for decremental even numbers to tackle gridding artifacts and to extract both coarse and fine features from the two parallel paths. Using three different types of MRI datasets, the suggested dilated PDCNN with the average ensemble method performs best. The accuracy achieved for the multiclass Kaggle dataset-III, Figshare dataset-II, and binary tumor identification dataset-I is 98.35%, 98.13%, and 98.67%, respectively. In comparison to state-of-the-art techniques, the suggested structure improves results by extracting both fine and coarse features, making it efficient.
- Published
- 2024
13. MS-HRNet: multi-scale high-resolution network for human pose estimation.
- Author
-
Wang, Yanxia, Wang, Renjie, Shi, Hu, and Liu, Dan
- Subjects
*POSE estimation (Computer vision), *PARKINSON'S disease, *PARAMETERIZATION, *AUTISTIC children, *AUTISM in children, *HUMAN-computer interaction, *DEEP learning
- Abstract
Human pose estimation has important applications in medical diagnosis (such as early diagnosis of autism in children and assisting with the diagnosis of Parkinson's disease), human-computer interaction, animation, and other fields. Currently, many human pose estimation algorithms are based on deep learning. However, most research focuses only on increasing the depth and width of the network model. This approach overlooks that merely enlarging the network's depth and width results in excessive parameterization, without enhancing the model's effective receptive field or its ability to extract multi-scale features. Hence, this paper constructs a network model, named MS-HRNet (Multi-Scale High-Resolution Network), for human pose estimation. Specifically, we propose a more concise and efficient version of the HRNet framework as the backbone network of MS-HRNet. This addresses HRNet's complex structure and large number of parameters, which cause training difficulties, and its inadequacy in handling multi-scale information. Additionally, we designed a multi-scale convolutional kernel parallel module named MSBlock (Multi-Scale Block) as the basic block of MS-HRNet. By introducing coordinate attention modules and ASFF (Adaptive Spatial Feature Fusion) modules, the model's ability to extract information is effectively increased, and the issue of feature conflict during the fusion of features with different resolutions is resolved, with only a small increase in the number of model parameters. To evaluate the effectiveness of the proposed model, we conducted comparison and ablation experiments against multiple existing human pose estimation models using popular human pose estimation datasets, including COCO2017 and MPII. On the COCO 2017 dataset, MS-HRNet has 41% fewer parameters and 59% lower computational complexity than the baseline model HRNet, while its detection accuracy (mAP) is 2.4 points higher. [ABSTRACT FROM AUTHOR]
- Published
- 2024
14. Conveyor belt defect detection based on deep learning methods.
- Author
-
钟信 and 彭力
- Abstract
Copyright of Computer Measurement & Control is the property of Magazine Agency of Computer Measurement & Control and its content may not be copied or emailed to multiple sites or posted to a listserv without the copyright holder's express written permission. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.)
- Published
- 2024
15. A Lightweight CER-YOLOv5s Algorithm for Detection of Construction Vehicles at Power Transmission Lines.
- Author
-
Yu, Pingping, Yan, Yuting, Tang, Xinliang, Shang, Yan, and Su, He
- Subjects
ELECTRIC lines, FEATURE extraction, PYRAMIDS, ALGORITHMS
- Abstract
To address the complex backgrounds and diverse target scales and shapes in power-line scenarios, as well as large model parameter sizes, insufficient feature extraction, and the tendency to miss small targets in engineering-vehicle detection tasks, a lightweight detection algorithm termed CER-YOLOv5s is proposed. First, the C3 module was restructured by embedding a lightweight Ghost bottleneck structure and convolutional attention module, enhancing the model's ability to extract key features while reducing computational costs. Second, an E-BiFPN feature pyramid network is proposed, utilizing channel attention mechanisms to effectively suppress background noise and enhance the model's focus on important regions. Bidirectional connections were introduced to optimize the feature fusion paths, improving the efficiency of multi-scale feature fusion. At the same time, in the feature fusion part, an ERM (enhanced receptive module) was added to expand the receptive field of shallow feature maps through multiple convolution repetitions, enhancing the global information perception capability in relation to small targets. Lastly, a Soft-DIoU-NMS suppression algorithm is proposed to improve the candidate box selection mechanism, addressing the issue of suboptimal detection of occluded targets. The experimental results indicated that, compared with the baseline YOLOv5s algorithm, the improved algorithm reduced parameters and computations by 27.8% and 31.9%, respectively. The mean average precision (mAP) increased by 2.9%, reaching 98.3%. This improvement surpasses recent mainstream algorithms and suggests stronger robustness across various scenarios. The algorithm meets the lightweight requirements for embedded devices in power-line scenarios. [ABSTRACT FROM AUTHOR]
- Published
- 2024
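The abstract names a Soft-DIoU-NMS step without giving its formula. One plausible reading, sketched below purely as an assumption, combines the published DIoU measure (IoU minus the normalised squared centre distance) with Gaussian Soft-NMS score decay. The function names, the clamping of negative DIoU to zero, and the `sigma` and `score_threshold` values are all hypothetical and may differ from the authors' method.

```python
import math


def diou(a, b):
    """Distance-IoU of two boxes given as (x1, y1, x2, y2)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    iou = inter / union if union > 0 else 0.0
    # Squared distance between box centres.
    centre_dist2 = ((a[0] + a[2]) / 2 - (b[0] + b[2]) / 2) ** 2 \
        + ((a[1] + a[3]) / 2 - (b[1] + b[3]) / 2) ** 2
    # Squared diagonal of the smallest box enclosing both.
    diag2 = (max(a[2], b[2]) - min(a[0], b[0])) ** 2 \
        + (max(a[3], b[3]) - min(a[1], b[1])) ** 2
    return iou - (centre_dist2 / diag2 if diag2 > 0 else 0.0)


def soft_diou_nms(boxes, scores, sigma=0.5, score_threshold=0.001):
    """Gaussian Soft-NMS using DIoU as the overlap measure (assumed variant)."""
    remaining = list(zip(boxes, scores))
    kept = []
    while remaining:
        best = max(range(len(remaining)), key=lambda i: remaining[i][1])
        box, score = remaining.pop(best)
        kept.append((box, score))
        decayed = []
        for other, other_score in remaining:
            overlap = max(0.0, diou(box, other))  # clamp: no decay for distant boxes
            other_score *= math.exp(-(overlap ** 2) / sigma)
            if other_score >= score_threshold:
                decayed.append((other, other_score))
        remaining = decayed
    return kept


if __name__ == "__main__":
    boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (100, 100, 110, 110)]
    for box, score in soft_diou_nms(boxes, [0.9, 0.8, 0.7]):
        print(box, round(score, 3))
```

Unlike hard NMS, overlapping candidates are down-weighted rather than discarded, which is the usual motivation for soft suppression when targets occlude one another.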
16. Nighttime safety helmet detection algorithm based on improved YOLOX.
- Author
-
韩贵金, 王瑞萱, 徐午言, and 李 君
- Abstract
Copyright of Journal of Computer Engineering & Applications is the property of Beijing Journal of Computer Engineering & Applications Journal Co Ltd. and its content may not be copied or emailed to multiple sites or posted to a listserv without the copyright holder's express written permission. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.)
- Published
- 2024
17. Receptive Field Space for Point Cloud Analysis.
- Author
-
Jiang, Zhongbin, Tao, Hai, and Liu, Ye
- Subjects
*POINT cloud, *CONVOLUTIONAL neural networks, *IMAGE processing
- Abstract
Similar to convolutional neural networks for image processing, existing analysis methods for 3D point clouds often require the designation of a local neighborhood to describe the local features of the point cloud. This local neighborhood is typically manually specified, which makes it impossible for the network to dynamically adjust the receptive field's range. If the range is too large, it tends to overlook local details, and if it is too small, it cannot establish global dependencies. To address this issue, we introduce in this paper a new concept: receptive field space (RFS). With a minor computational cost, we extract features from multiple consecutive receptive field ranges to form this new receptive field space. On this basis, we further propose a receptive field space attention mechanism, enabling the network to adaptively select the most effective receptive field range from RFS, thus equipping the network with the ability to adjust granularity adaptively. Our approach achieved state-of-the-art performance in both point cloud classification, with an overall accuracy (OA) of 94.2%, and part segmentation, achieving an mIoU of 86.0%, demonstrating the effectiveness of our method. [ABSTRACT FROM AUTHOR]
- Published
- 2024
18. Multi-scale context fusion network for melanoma segmentation.
- Author
-
Zhenhua Li and Lei Zhang
- Abstract
To address the problems that the edges of melanoma images are blurred, the contrast with the background is low, and hair occlusion makes accurate segmentation difficult, this paper proposes MSCNet, a model for melanoma segmentation based on the U-Net framework. First, a multi-scale pyramid fusion module is designed to reconstruct the skip connection and transmit global information to the decoder. Second, a contextual information conduction module is added to the top of the encoder. The module provides different receptive fields for the segmented target by using dilated (atrous) convolution with different dilation rates, so as to better fuse multi-scale contextual information. In addition, in order to suppress redundant information in the input image and pay more attention to melanoma feature information, a global channel attention mechanism is introduced into the decoder. Finally, in order to solve the problem of lesion class imbalance, this paper uses a combined loss function. The algorithm is verified on the ISIC 2017 and ISIC 2018 public datasets. The experimental results indicate that the proposed algorithm has better accuracy for melanoma segmentation compared with other CNN-based image segmentation algorithms. [ABSTRACT FROM AUTHOR]
- Published
- 2024
19. Stable 3D Deep Convolutional Autoencoder Method for Ultrasonic Testing of Defects in Polymer Composites.
- Author
-
Liu, Yi, Yu, Qing, Liu, Kaixin, Zhu, Ningtao, and Yao, Yuan
- Subjects
*POLYMER testing, *ULTRASONIC imaging, *SURFACE defects, *ULTRASONIC testing, *ECHO
- Abstract
Ultrasonic testing is widely used for defect detection in polymer composites owing to advantages such as fast processing speed, simple operation, high reliability, and real-time monitoring. However, defect information in ultrasound images is not easily detectable because of the influence of ultrasound echoes and noise. In this study, a stable three-dimensional deep convolutional autoencoder (3D-DCA) was developed to identify defects in polymer composites. Through 3D convolutional operations, it can synchronously learn the spatiotemporal properties of the data volume. Subsequently, the depth receptive field (RF) of the hidden layer in the autoencoder maps the defect information to the original depth location, thereby mitigating the effects of the defect surface and bottom echoes. In addition, a dual-layer encoder was designed to improve the hidden layer visualization results. Consequently, the size, shape, and depth of the defects can be accurately determined. The feasibility of the method was demonstrated through its application to defect detection in carbon-fiber-reinforced polymers. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
20. Image inpainting method based on an optimized receptive field strategy [基于优化感受野策略的图像修复方法].
- Author
-
刘恩泽, 刘华明, 王秀友, and 毕学慧
- Abstract
Currently popular image inpainting methods based on deep neural networks typically employ feature extractors with large receptive fields. However, when restoring local patterns and textures, they often generate artifacts or distorted textures, and thus fail to recover the overall semantic and visual structure of the image. To address this issue, this paper proposes a novel image inpainting method, called ORFNet, which combines coarse and fine inpainting by employing an optimized receptive field strategy. First, it obtains a coarse inpainting result using a generative adversarial network with a large receptive field. It then uses a model with a small receptive field to refine local texture details. Finally, it performs a global refinement using an encoder-decoder network based on attention mechanisms. Validation on the CelebA, Paris StreetView, and Places2 datasets demonstrates that ORFNet outperforms existing representative inpainting methods. It yields a 1.98 dB increase in PSNR and a 2.49% improvement in SSIM, along with an average 2.4% reduction in LPIPS. Experimental results confirm the effectiveness of the proposed method, which performs well across various receptive field settings and produces more realistic and natural visual results. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
21. Alpha-2 nicotinic acetylcholine receptors regulate spectral integration in auditory cortex
- Author
-
Irakli Intskirveli, Susan Gil, Ronit Lazar, and Raju Metherate
- Subjects
nicotine ,mouse ,receptive field ,electrophysiology ,current-source density ,neuromodulation ,Neurosciences. Biological psychiatry. Neuropsychiatry ,RC321-571 - Abstract
Introduction: In primary auditory cortex (A1), nicotinic acetylcholine receptors (nAChRs) containing α2 subunits are expressed in layer 5 Martinotti cells (MCs), inhibitory interneurons that send a main axon to superficial layers to inhibit distal apical dendrites of pyramidal cells (PCs). MCs also contact interneurons in supragranular layers that, in turn, inhibit PCs. Thus, MCs may regulate PCs via inhibition and disinhibition, respectively, of distal and proximal apical dendrites. Auditory inputs to PCs include thalamocortical inputs to middle layers relaying information about characteristic frequency (CF) and near-CF stimuli, and intracortical long-distance (“horizontal”) projections to multiple layers carrying information about spectrally distant (“nonCF”) stimuli. CF and nonCF inputs integrate to create broad frequency receptive fields (RFs). Systemic administration of nicotine activates nAChRs to “sharpen” RFs (to increase gain within a narrowed RF), resulting in enhanced responses to CF stimuli and reduced responses to nonCF stimuli. While nicotinic mechanisms to increase gain have been identified, the mechanism underlying RF narrowing is unknown. Methods: Here, we examine the role of α2 nAChRs in mice with α2 nAChR-expressing neurons labeled fluorescently, and in mice with α2 nAChRs genetically deleted. Results: The distribution of fluorescent neurons in auditory cortex was consistent with previous studies demonstrating α2 nAChRs in layer 5 MCs, including nonpyramidal somata in layer 5 and dense processes in layer 1. We also observed label in subcortical auditory regions, including processes, but no somata, in the medial geniculate body, and both fibers and somata in the inferior colliculus.
Using electrophysiological (current-source density) recordings in α2 nAChR knock-out mice, we found that systemic nicotine failed to enhance CF-evoked inputs to layer 4, suggesting a role for subcortical α2 nAChRs, and failed to reduce nonCF-evoked responses, suggesting that α2 nAChRs regulate horizontal projections to produce RF narrowing. Discussion: The results support the hypothesis that α2 nAChRs function to simultaneously enhance RF gain and narrow RF breadth in A1. Notably, a similar neural circuit may recur throughout cortex and hippocampus, suggesting widespread conserved functions regulated by α2 nAChRs.
- Published
- 2024
- Full Text
- View/download PDF
22. YOLO-BS: A Better Object Detection Model for Real-Time Driver Behavior Detection
- Author
-
Xi, Yang, Guo, Jinxin, Ma, Ming, Goos, Gerhard, Series Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Huang, De-Shuang, editor, Pan, Yijie, editor, and Guo, Jiayang, editor
- Published
- 2024
- Full Text
- View/download PDF
23. A Novel Facial Expression Recognition (FER) Model Using Multi-scale Attention Network
- Author
-
Ghadai, Chakrapani, Patra, Dipti, Okade, Manish, Filipe, Joaquim, Editorial Board Member, Ghosh, Ashish, Editorial Board Member, Zhou, Lizhu, Editorial Board Member, Kaur, Harkeerat, editor, Jakhetiya, Vinit, editor, Goyal, Puneet, editor, Khanna, Pritee, editor, Raman, Balasubramanian, editor, and Kumar, Sanjeev, editor
- Published
- 2024
- Full Text
- View/download PDF
24. FCGAN: Spectral Convolutions via FFT for Channel-Wide Receptive Field in Generative Adversarial Networks
- Author
-
Gomes, Pedro H. B., Santos, Luiz Fernando, Gattass, Marcelo, Rannenberg, Kai, Editor-in-Chief, Soares Barbosa, Luís, Editorial Board Member, Carette, Jacques, Editorial Board Member, Tatnall, Arthur, Editorial Board Member, Neuhold, Erich J., Editorial Board Member, Stiller, Burkhard, Editorial Board Member, Stettner, Lukasz, Editorial Board Member, Pries-Heje, Jan, Editorial Board Member, Kreps, David, Editorial Board Member, Rettberg, Achim, Editorial Board Member, Furnell, Steven, Editorial Board Member, Mercier-Laurent, Eunika, Editorial Board Member, Winckler, Marco, Editorial Board Member, Malaka, Rainer, Editorial Board Member, Maglogiannis, Ilias, editor, Iliadis, Lazaros, editor, Macintyre, John, editor, Avlonitis, Markos, editor, and Papaleonidas, Antonios, editor
- Published
- 2024
- Full Text
- View/download PDF
25. Perceptive Fields and the Study of Inherited Retinal Degeneration
- Author
-
Rizzi, Matteo, Powell, Kate, Singh, Arun D., Series Editor, Prakash, Gyan, editor, and Iwata, Takeshi, editor
- Published
- 2024
- Full Text
- View/download PDF
26. SAMDConv: Spatially Adaptive Multi-scale Dilated Convolution
- Author
-
Hu, Haigen, Yu, Chenghan, Zhou, Qianwei, Guan, Qiu, Chen, Qi, Goos, Gerhard, Founding Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Liu, Qingshan, editor, Wang, Hanzi, editor, Ma, Zhanyu, editor, Zheng, Weishi, editor, Zha, Hongbin, editor, Chen, Xilin, editor, Wang, Liang, editor, and Ji, Rongrong, editor
- Published
- 2024
- Full Text
- View/download PDF
27. TRFN: Triple-Receptive-Field Network for Regional-Texture and Holistic-Structure Image Inpainting
- Author
-
Xiao, Qingguo, Han, Zhiyuan, Liu, Zhaodong, Pan, Guangyuan, Zheng, Yanpeng, Filipe, Joaquim, Editorial Board Member, Ghosh, Ashish, Editorial Board Member, Prates, Raquel Oliveira, Editorial Board Member, Zhou, Lizhu, Editorial Board Member, Luo, Biao, editor, Cheng, Long, editor, Wu, Zheng-Guang, editor, Li, Hongyi, editor, and Li, Chaojie, editor
- Published
- 2024
- Full Text
- View/download PDF
28. Knowledge Distillation via Information Matching
- Author
-
Zhu, Honglin, Jiang, Ning, Tang, Jialiang, Huang, Xinlei, Goos, Gerhard, Founding Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Luo, Biao, editor, Cheng, Long, editor, Wu, Zheng-Guang, editor, Li, Hongyi, editor, and Li, Chaojie, editor
- Published
- 2024
- Full Text
- View/download PDF
29. A state-of-the-art survey of deep learning models for automated pavement crack segmentation
- Author
-
Hongren Gong, Liming Liu, Haimei Liang, Yuhui Zhou, and Lin Cong
- Subjects
Pavement maintenance ,Crack detection ,Deep learning ,Semantic segmentation ,Receptive field ,Transportation engineering ,TA1001-1280 - Abstract
Surveying road cracks in a timely, complete, and accurate way is pivotal to pavement maintenance planning. Motivated by the increasingly heavy task of identifying cracks, researchers have developed many crack segmentation models based on deep learning (DL) methods, with significantly different levels of accuracy, efficiency, and generalization capacity. Although many of the models provide satisfactory detection performance, why these models work still needs to be determined. The objective of this study is to survey recent advances in automated DL crack recognition and provide evidence for their underlying working mechanism. We first reviewed 54 DL crack recognition methods to summarize critical factors in these models. We then evaluated the performance of fourteen well-known semantic segmentation models using two quantitative metrics, the F-1 score and mIoU. Next, the effective receptive field and class activation map of the included models were visualized as a qualitative evaluation of the training results. Based on the literature review and comparison results, larger kernel size, feature fusion, and attention modules all contribute to improved model performance. Striking a balance between increasing the effective receptive field and computational/memory efficiency is the key to designing DL crack segmentation models. Finally, some potential directions and suggestions for future development are provided, such as developing semi-supervised or unsupervised learning to address the high cost of pixel-level labeling.
- Published
- 2024
- Full Text
- View/download PDF
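The survey above compares models by F-1 score and mIoU. As a minimal sketch of how these two metrics can be computed for a binary crack mask, and not the evaluation code used in the survey, the example below works on flat lists of 0/1 labels. The function names and the convention of returning 1.0 when a denominator is zero are assumptions.

```python
def confusion_counts(pred, target):
    """Count TP, FP, FN, TN for two equal-length sequences of 0/1 labels."""
    tp = fp = fn = tn = 0
    for p, t in zip(pred, target):
        if p == 1 and t == 1:
            tp += 1
        elif p == 1 and t == 0:
            fp += 1
        elif p == 0 and t == 1:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def f1_score(pred, target):
    """F-1 score of the positive (crack) class."""
    tp, fp, fn, _ = confusion_counts(pred, target)
    denom = 2 * tp + fp + fn
    # Assumed convention: no positives predicted or present counts as perfect.
    return 2 * tp / denom if denom else 1.0


def mean_iou(pred, target):
    """Mean IoU over the two classes (crack and background)."""
    tp, fp, fn, tn = confusion_counts(pred, target)
    fg_denom = tp + fp + fn
    bg_denom = tn + fp + fn
    iou_fg = tp / fg_denom if fg_denom else 1.0  # assumed convention
    iou_bg = tn / bg_denom if bg_denom else 1.0  # assumed convention
    return (iou_fg + iou_bg) / 2


if __name__ == "__main__":
    # Hypothetical four-pixel example.
    pred = [1, 1, 0, 0]
    target = [1, 0, 0, 0]
    print(f1_score(pred, target), mean_iou(pred, target))
```

For thin structures such as cracks, the background class dominates the pixel count, so the foreground IoU is usually the more informative of the two terms averaged in mIoU.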
30. Marr's three levels of analysis are useful as a framework for neuroscience.
- Author
-
Lengyel, Máté
- Subjects
- *
ACTION potentials , *NEUROSCIENCES , *DENDRITES , *NEURAL circuitry , *VISION - Published
- 2024
- Full Text
- View/download PDF
31. Instance segmentation algorithm based on attention mechanism and image contour [基于注意力机制和图像轮廓的实例分割算法].
- Author
-
顾登华 and 顾春华
- Abstract
Contour-based instance segmentation represents an object with a small number of contour nodes, which effectively reduces the number of parameters and improves efficiency. However, its segmentation quality is lower, and it does not match traditional pixel-by-pixel segmentation algorithms in accuracy. To improve accuracy, this study introduces a refined contour-based instance segmentation model that incorporates an attention mechanism (Attend the Contour snake, AC-snake). An improved Largekernel+ is added to the backbone network to enlarge the receptive field of the model and extract richer feature information. The network structure at the contour vertex deformation stage is improved, and a Dual Channel attention (DC-attention) module is incorporated to enhance the effective information of contour vertices, reduce invalid parameters in the training network, and improve detection accuracy and training speed. The experimental results show that, on the Cityscapes validation data set, the improved model outperforms the original model. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
32. Optimization of segmentation model based on maximization information fusion and its application in nuclear image analysis.
- Author
-
Xiong, Feiyan and Wei, Yun
- Subjects
- *
IMAGE segmentation , *HEMATOXYLIN & eosin staining , *IMAGE analysis , *IMAGE fusion - Abstract
The Whole Slide Image (WSI) is a pathological image with Hematoxylin & Eosin staining. The low-contrast color staining poses a challenge for analysis. We propose SNSeg (Staining Nuclear Segmentation) to improve segmentation performance on WSIs and obtain accurate nuclear regions. At the macro level, we reconstruct the feature fusion mode and connection path to reduce semantic loss during gradient descent. At the micro level, we first design a multiple receptive field convolution unit (RFC), which can adjust the receptive field to adapt to the nuclei size of the input image. Second, to efficiently fuse the feature information extracted from the encoder, we design a multi-branch channel attention fusion unit (MCA), which integrates the information flows of different branches channel-wise into a unified module. Finally, we design a parallel-output decoder fusion (DF) module that fuses the output spatial attention to generate the final segmentation results. In addition, we introduce a watershed method based on the distance transform to separate adherent nuclei and mark contours. We design experiments to verify SNSeg on the public datasets MoNuSeg, TNBC, and PanNuke. The segmentation results on MoNuSeg show that SNSeg achieves an accuracy of 84.32% and a Dice score of 81.21%. Compared with other networks, SNSeg has competitive advantages in segmentation performance and number of network parameters. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
33. Seat-belt behavior detection for vehicle occupants based on improved YOLOv5s [基于改进 YOLOv5s 的车载人员安全带行为检测].
- Author
-
焦波, 焦良葆, 吴继薇, 祝阳, and 高阳
- Abstract
Copyright of Computer Measurement & Control is the property of Magazine Agency of Computer Measurement & Control and its content may not be copied or emailed to multiple sites or posted to a listserv without the copyright holder's express written permission. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.)
- Published
- 2024
- Full Text
- View/download PDF
34. X-ray prohibited item detection based on adaptive multi-scale feature fusion [基于自适应多尺度特征融合的X光违禁品检测].
- Author
-
张 良 and 薛志诚
- Abstract
Copyright of Journal of Signal Processing is the property of Journal of Signal Processing and its content may not be copied or emailed to multiple sites or posted to a listserv without the copyright holder's express written permission.
- Published
- 2024
- Full Text
- View/download PDF
35. Local field potentials, spiking activity, and receptive fields in human visual cortex.
- Author
-
Luo, Lu, Wang, Xiongfei, Lu, Junshi, Chen, Guanpeng, Luan, Guoming, Li, Wu, Wang, Qian, and Fang, Fang
- Abstract
The concept of receptive field (RF) is central to sensory neuroscience. Neuronal RF properties have been substantially studied in animals, while those in humans remain nearly unexplored. Here, we measured neuronal RFs with intracranial local field potentials (LFPs) and spiking activity in human visual cortex (V1/V2/V3). We recorded LFPs via macro-contacts and discovered that RF sizes estimated from low-frequency activity (LFA, 0.5–30 Hz) were larger than those estimated from low-gamma activity (LGA, 30–60 Hz) and high-gamma activity (HGA, 60–150 Hz). We then took a rare opportunity to record LFPs and spiking activity via microwires in V1 simultaneously. We found that RF sizes and temporal profiles measured from LGA and HGA closely matched those from spiking activity. In sum, this study reveals that spiking activity of neurons in human visual cortex could be well approximated by LGA and HGA in RF estimation and temporal profile measurement, implying the pivotal functions of LGA and HGA in early visual information processing. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
36. Dynamic receptive field adaptation for scene text recognition.
- Author
-
Tian, Shu, Zhu, Kang-Xi, Qin, Hai-Bo, and Yang, Chun
- Subjects
- *
TEXT recognition , *TRANSFORMER models - Abstract
Scene text recognition methods based on the Encoder–Decoder framework generally assume that the proportions of the characters in the same text instance are basically the same. However, this assumption does not always hold for complex scene images. To adaptively revise the receptive field according to the different fonts in a scene text image, we propose a Dynamic Receptive Field Adaption Framework, which consists of a Memory Attention (MA) module and a Dynamic Feature Adaptive (DFA) module. MA perceives historical location information to adapt to changes in character position in the decoder. DFA dynamically selects the most distinguishing features from feature maps of different levels. Additionally, MA and DFA can be easily added to existing attention-based and transformer-based text recognition methods to improve their performance. In extensive experiments on public benchmark datasets, including IIIT-5K, SVT, SVTP, CUTE80, RECTS, LSVT, and RCTW, our method has shown effectiveness and robustness. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
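37. Real-time object detection algorithm based on attention mechanism and multi-spatial pyramid pooling [基于注意力机制和多空间金字塔池化的实时目标检测算法].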
- Author
-
王国刚, 李泽欣, and 董志豪
- Abstract
Copyright of Computer Measurement & Control is the property of Magazine Agency of Computer Measurement & Control and its content may not be copied or emailed to multiple sites or posted to a listserv without the copyright holder's express written permission.
- Published
- 2024
- Full Text
- View/download PDF
38. Large coal block detection on belt conveyors based on improved YOLOv5 [基于改进YOLOv5 的带式输送机大块煤检测].
- Author
-
秦宇龙, 程继明, 任一个, 王晓晴, 赵青, and 安翠娟
- Abstract
Copyright of Journal of Mine Automation is the property of Industry & Mine Automation Editorial Department and its content may not be copied or emailed to multiple sites or posted to a listserv without the copyright holder's express written permission.
- Published
- 2024
- Full Text
- View/download PDF
39. Pedestrian detection in infrared images based on an improved YOLOv5 algorithm [基于改进YOLOv5 算法的红外图像行人目标检测].
- Author
-
高正中, 于明沆, 孟晗, and 殷秀程
- Abstract
Copyright of China Sciencepaper is the property of China Sciencepaper and its content may not be copied or emailed to multiple sites or posted to a listserv without the copyright holder's express written permission.
- Published
- 2024
40. MfvPose: A multi-scale hybrid framework for human pose estimation.
- Author
-
Ran, Lang, Hong, Chaoqun, Zhang, Xuebai, Tang, Chaohui, and Xie, Yuhong
- Subjects
- *
TRANSFORMER models , *MATRIX multiplications , *HUMAN beings - Abstract
Human pose estimation is a challenging visual task that relies on spatial location information. To improve the performance of human pose estimation, it is important to accurately determine the constraint relationship among keypoints. To address this, we propose MfvPose, a novel hybrid model that leverages rich multi-scale information. The proposed model incorporates the HRFOV module, which uses cascaded atrous convolution to maintain high-resolution representations of the backbone extractor and enrich the multi-scale information. In addition, we introduce learnable scalar weights to the Transformer encoder. In detail, it involves a multiplication by a diagonal matrix with learnable scalar weights on output of each residual block, which improves the dynamics of model training and enhances the accuracy of human pose estimation. It is experimentally shown that our proposed MfvPose achieves promising results on various benchmarks. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
41. Image segmentation algorithm based on channel feature pyramid [基于通道特征金字塔的图像分割算法].
- Author
-
孙红, 杨晨, and 莫光萍
- Abstract
To address the high computational cost and redundant parameters of semantic segmentation models, this study proposes a channel feature pyramid module. Based on this module and a lightweight attention mechanism, a real-time semantic segmentation network is constructed. The channel feature pyramid module creates a sufficiently large receptive field and densely utilizes context information. Starting from the second channel, it gradually combines feature maps with summation operations and concatenates them to build the final hierarchical feature map, which is used in regular convolutional layers. A convolutional attention module is then added to improve segmentation accuracy. Without any pre-training or post-processing, the algorithm achieves a segmentation accuracy of 68.1% on the CamVid data set using only 0.75 MB parameters and 5.3 MB memory on a single GTX2080Ti, and on the Cityscapes data set it reaches a mean intersection over union (mIoU) of 75.7% at an inference speed of 56 frames per second. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
42. YOLO-ERF: lightweight object detector for UAV aerial images.
- Author
-
Wang, Xin, He, Ning, Hong, Chen, Sun, Fengxi, Han, Wenjing, and Wang, Qi
- Subjects
- *
OBJECT recognition (Computer vision) , *DRONE aircraft , *DETECTORS , *COMPUTER vision , *PERSONAL names - Abstract
The application of object detection techniques in the field of unmanned aerial vehicles (UAVs) is an important research direction in computer vision. Because object detection in UAV aerial images needs to meet real-time requirements, a challenging problem in this technology is the trade-off between network parameters and detection accuracy. To solve this problem, this paper proposes a lightweight object detector family named YOLO-ERF. First, this paper proposes the effective receptive field (ERF) module, which can increase the convolutional kernel receptive field while preserving local details. The ERF module is then used to design a lightweight backbone to expand the network receptive field without the need for attaching additional context modules after the backbone to expand the receptive field. In addition, the proposed detectors use the ERF module to critically optimize the path aggregation network structure to improve accuracy with reduced network parameters. Finally, a lightweight detection head is proposed to improve small object recognition in complex backgrounds. With these optimizations, the YOLO-ERF models in this paper achieved a better trade-off between accuracy and parameters than other mainstream models, achieving strong results on the VisDrone and COCO datasets. YOLO-ERF-T reduced the number of network parameters by 40.3% when compared with YOLOv7-Tiny while increasing the average accuracy by 2.4% and 1.9%, respectively, in VisDrone and COCO datasets. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
43. Improved YOLOv5s Traffic Sign Detection.
- Author
-
Xiaoming Zhang and Ying Tia
- Subjects
- *
TRAFFIC signs & signals , *TRAFFIC monitoring - Abstract
To address the small proportion of the image occupied by traffic signs in natural scenes, their fuzzy and complex appearance, and the low detection accuracy, missed detections, and false detections of current traffic sign detection algorithms, a traffic sign detection algorithm based on YOLOv5s is proposed. First, a Regional Feature Enhancement Module (RFEM) is presented, which uses dilated convolutions with different dilation rates and 1×1 convolution to expand the receptive field and change the feature dimension. Feature fusion is carried out by addition to enrich the information in each dimension of the image and improve the final classification accuracy of the model. Second, a 160×160 detection layer is added to the detection layers of the original algorithm, and its features are fused with the local small-target information extracted from the backbone network to increase the detection accuracy for small targets. Finally, K-means++ is used to recluster the initial anchor boxes, which accelerates model convergence, reduces the bounding-box loss, and improves detection accuracy. The experimental results show that the improved algorithm achieves 90.10% precision, 82.36% recall, and 87.98% mAP on the TT100K dataset. Compared with the original YOLOv5s algorithm, these values are increases of 7.89%, 5.05%, and 4.36%, respectively. This method can be effectively applied to traffic sign detection. [ABSTRACT FROM AUTHOR]
- Published
- 2023
44. Learning discrete adaptive receptive fields for graph convolutional networks.
- Author
-
Ma, Xiaojun, Li, Ziyao, Song, Guojie, and Shi, Chuan
- Abstract
Different nodes in a graph neighborhood generally have different importance. In previous work on graph convolutional networks (GCNs), such differences are typically modeled with attention mechanisms. However, as we prove in our paper, soft attention weights suffer from undesired smoothness in large neighborhoods (not to be confused with the oversmoothing effect in deep GCNs). To address this weakness, we introduce a novel framework for conducting graph convolutions, where nodes are discretely selected among multi-hop neighborhoods to construct adaptive receptive fields (ARFs). ARFs enable GCNs to get rid of the smoothness of soft attention weights, as well as to efficiently explore long-distance dependencies in graphs. We further propose GRARF (GCN with reinforced adaptive receptive fields) as an instance, where an optimal policy for constructing ARFs is learned with reinforcement learning. GRARF achieves or matches state-of-the-art performance on public datasets from different domains. Our further analysis corroborates that GRARF is more robust than attention models against neighborhood noise. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
45. Training Tricks for Steel Microstructure Segmentation with Deep Learning.
- Author
-
Ma, Xudong and Yu, Yunhe
- Subjects
DEEP learning ,DATA augmentation ,MICROSTRUCTURE ,FERRITES ,CARBIDES - Abstract
Data augmentation and other training techniques have improved the performance of deep learning segmentation methods for steel materials. However, these methods often depend on the dataset and do not provide general principles for segmenting different microstructural morphologies. In this work, we collected 64 granular carbide images (2048 × 1536 pixels) and 26 blocky ferrite images (2560 × 1756 pixels). We used five carbide images and two ferrite images and derived from them the test set to investigate the influence of frequently used training techniques on model segmentation accuracy. We propose a novel method for quickly building models that achieve the highest segmentation accuracy for a given dataset through combining multiple training techniques that enhance the segmentation quality. This method leads to a 1–2.5% increase in mIoU values. We applied the optimal models to the quantization of carbides. The results show that the optimal models achieve the smallest errors of 5.39 nm for the mean radius and 29 for the total number of carbides on the test set. The segmentation results are also more reasonable than those of traditional segmentation methods. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
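46. Dual-branch remote sensing image dehazing network with hierarchical feature interaction and enhanced receptive field [层级特征交互与增强感受野双分支遥感图像去雾网络].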
- Author
-
孙航, 方帅领, 但志平, 任东, 余梅, and 孙水发
- Subjects
DEEP learning ,REMOTE sensing - Abstract
Copyright of Journal of Remote Sensing is the property of Editorial Office of Journal of Remote Sensing & Science Publishing Co. and its content may not be copied or emailed to multiple sites or posted to a listserv without the copyright holder's express written permission.
- Published
- 2023
- Full Text
- View/download PDF
47. Inhibition, but not excitation, recovers from partial cone loss with greater spatiotemporal integration, synapse density, and frequency
- Author
-
Lee, Joo Yeun, Care, Rachel A, Kastner, David B, Della Santina, Luca, and Dunn, Felice A
- Subjects
Biological Sciences ,Neurosciences ,Eye Disease and Disorders of Vision ,Underpinning research ,1.1 Normal biological development and functioning ,Neurological ,Animals ,Mice ,Retina ,Retinal Cone Photoreceptor Cells ,Retinal Ganglion Cells ,Retinal Rod Photoreceptor Cells ,Synapses ,Visual Pathways ,cellular and systems neuroscience ,degeneration ,disease ,ganglion cells ,homeostatic plasticity ,perception ,photoreceptors ,receptive field ,retina ,vision ,Biochemistry and Cell Biology ,Medical Physiology ,Biological sciences - Abstract
Neural circuits function in the face of changing inputs, either caused by normal variation in stimuli or by cell death. To maintain their ability to perform essential computations with partial inputs, neural circuits make modifications. Here, we study the retinal circuit's responses to changes in light stimuli or in photoreceptor inputs by inducing partial cone death in the mature mouse retina. Can the retina withstand or recover from input loss? We find that the excitatory pathways exhibit functional loss commensurate with cone death and with some aspects predicted by partial light stimulation. However, inhibitory pathways recover functionally from lost input by increasing spatiotemporal integration in a way that is not recapitulated by partially stimulating the control retina. Anatomically, inhibitory synapses are upregulated on secondary bipolar cells and output ganglion cells. These findings demonstrate the greater capacity for inhibition, compared with excitation, to modify spatiotemporal processing with fewer cone inputs.
- Published
- 2022
48. A Lightweight CER-YOLOv5s Algorithm for Detection of Construction Vehicles at Power Transmission Lines
- Author
-
Pingping Yu, Yuting Yan, Xinliang Tang, Yan Shang, and He Su
- Subjects
power line ,YOLOv5s ,lightweight network ,bidirectional feature pyramid ,attention mechanism ,receptive field ,Technology ,Engineering (General). Civil engineering (General) ,TA1-2040 ,Biology (General) ,QH301-705.5 ,Physics ,QC1-999 ,Chemistry ,QD1-999 - Abstract
Power-line scenarios are characterized by complex backgrounds and targets of diverse scales and shapes, and engineering-vehicle detection in these scenarios suffers from large model parameter sizes, insufficient feature extraction, and a tendency to miss small targets. To address these issues, a lightweight detection algorithm termed CER-YOLOv5s is proposed. First, the C3 module was restructured by embedding a lightweight Ghost bottleneck structure and a convolutional attention module, enhancing the model’s ability to extract key features while reducing computational costs. Second, an E-BiFPN feature pyramid network is proposed, utilizing channel attention mechanisms to effectively suppress background noise and enhance the model’s focus on important regions. Bidirectional connections were introduced to optimize the feature fusion paths, improving the efficiency of multi-scale feature fusion. At the same time, in the feature fusion part, an ERM (enhanced receptive module) was added to expand the receptive field of shallow feature maps through multiple convolution repetitions, enhancing the global information perception capability in relation to small targets. Lastly, a Soft-DIoU-NMS suppression algorithm is proposed to improve the candidate box selection mechanism, addressing the issue of suboptimal detection of occluded targets. The experimental results indicated that, compared with the baseline YOLOv5s algorithm, the improved algorithm reduced parameters and computations by 27.8% and 31.9%, respectively. The mean average precision (mAP) increased by 2.9%, reaching 98.3%. This improvement surpasses recent mainstream algorithms and suggests stronger robustness across various scenarios. The algorithm meets the lightweight requirements for embedded devices in power-line scenarios.
- Published
- 2024
- Full Text
- View/download PDF
49. Receptive Field Space for Point Cloud Analysis
- Author
-
Zhongbin Jiang, Hai Tao, and Ye Liu
- Subjects
point cloud ,receptive field ,attention ,Chemical technology ,TP1-1185 - Abstract
Similar to convolutional neural networks for image processing, existing analysis methods for 3D point clouds often require the designation of a local neighborhood to describe the local features of the point cloud. This local neighborhood is typically manually specified, which makes it impossible for the network to dynamically adjust the receptive field’s range. If the range is too large, it tends to overlook local details, and if it is too small, it cannot establish global dependencies. To address this issue, we introduce in this paper a new concept: receptive field space (RFS). With a minor computational cost, we extract features from multiple consecutive receptive field ranges to form this new receptive field space. On this basis, we further propose a receptive field space attention mechanism, enabling the network to adaptively select the most effective receptive field range from RFS, thus equipping the network with the ability to adjust granularity adaptively. Our approach achieved state-of-the-art performance in both point cloud classification, with an overall accuracy (OA) of 94.2%, and part segmentation, achieving an mIoU of 86.0%, demonstrating the effectiveness of our method.
- Published
- 2024
- Full Text
- View/download PDF
50. MAIM-VO: A Robust Visual Odometry with Mixed MLP for Weak Textured Environment
- Author
-
Shen, Zhiwei, Kong, Bin, Filipe, Joaquim, Editorial Board Member, Ghosh, Ashish, Editorial Board Member, Prates, Raquel Oliveira, Editorial Board Member, Zhou, Lizhu, Editorial Board Member, Yongtian, Wang, editor, and Lifang, Wu, editor
- Published
- 2023
- Full Text
- View/download PDF
Discovery Service for Jio Institute Digital Library