135 results for "Yi-Ping Hung"
Search Results
2. 3D Video Stabilization with Depth Estimation by CNN-based Optimization
- Author
-
Yi-Ping Hung, Yao-Chih Lee, Kuan-Wei Tseng, Chien-Cheng Chen, Chu-Song Chen, and Yu-Ta Chen
- Subjects
Computer science, Frame (networking), Feature extraction, 3D reconstruction, Image processing and computer vision, Stability (learning theory), Visualization, Image stabilization, Robustness (computer science), Computer vision, Artificial intelligence, Smoothing - Abstract
Video stabilization is an essential component of visual quality enhancement. Early methods rely on feature tracking to recover either 2D or 3D frame motion, and thus suffer from the limited robustness of local feature extraction and tracking in shaky videos. More recently, learning-based methods seek frame transformations from high-level information via deep neural networks to overcome the robustness issues of feature tracking. Nevertheless, to the best of our knowledge, no learning-based method yet leverages 3D cues when inferring the transformations, so such methods produce artifacts in scenes with complex depth. In this paper, we propose Deep3D Stabilizer, a novel 3D depth-based learning method for video stabilization. We take advantage of the recent self-supervised framework for jointly learning depth and camera ego-motion from raw videos. Our approach requires no pre-training data but stabilizes the input video directly via 3D reconstruction. The rectification stage incorporates the 3D scene depth and camera motion to smooth the camera trajectory and synthesize the stabilized video. Unlike most one-size-fits-all learning-based methods, our smoothing algorithm lets users manipulate the stability of a video efficiently. Experimental results on challenging benchmarks show that the proposed solution consistently outperforms state-of-the-art methods on almost all motion categories. (A toy sketch of the trajectory-smoothing idea follows this entry.)
- Published
- 2021
- Full Text
- View/download PDF
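The user-controllable smoothing described in the abstract above can be pictured with a minimal sketch: low-pass filter the recovered camera path, with the filter strength as the user's stability knob. This is an illustration of the general idea only, not the authors' code; the Gaussian filter and all names here are my assumptions.

```python
# Minimal sketch of user-controllable camera-path smoothing: given per-frame
# camera positions recovered by 3D reconstruction, a Gaussian low-pass filter
# with a user-chosen strength yields the stabilized path.
import numpy as np
from scipy.ndimage import gaussian_filter1d

def smooth_camera_path(positions: np.ndarray, strength: float) -> np.ndarray:
    """positions: (N, 3) per-frame camera centers; strength: sigma in frames."""
    # Filter each coordinate independently along the time axis.
    return gaussian_filter1d(positions, sigma=strength, axis=0, mode="nearest")

# The warp that stabilizes frame t would then be derived from the offset
# between the original and smoothed poses (the full method also needs
# rotations and per-pixel depth for reprojection).
raw = np.cumsum(np.random.randn(300, 3) * 0.01, axis=0)  # a shaky path
stable = smooth_camera_path(raw, strength=15.0)          # larger sigma = more stable
```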
3. Camera Ego-Positioning Using Sensor Fusion and Complementary Method
- Author
-
Yan Bin Song, Tian Yi Shen, Peng Yuan Kao, Kuan Wei Tseng, Shih Wei Hu, Yi-Ping Hung, Kuan-Wen Chen, and Sheng-Wen Shih
- Subjects
Simulation and modeling, Introductory and survey, Computer science, Visual positioning, Simultaneous localization and mapping, Sensor fusion, Tracking (particle physics), Inertial measurement unit, Feature (computer vision), Fuse (electrical), Computer vision, Artificial intelligence - Abstract
Visual simultaneous localization and mapping (SLAM) is a common solution for camera ego-positioning. However, SLAM sometimes loses tracking, for instance due to fast camera motion or featureless or repetitive environments. To account for these limitations, we fuse the visual positioning results with inertial measurement unit (IMU) data using a filter-based, loosely coupled sensor fusion method, and we further combine feature-based SLAM with direct SLAM via the proposed complementary fusion to retain the advantages of both: the accurate positioning of feature-based SLAM is kept, while its difficulty with featureless scenes is covered by direct SLAM. Experimental results show that the proposed complementary method improves the positioning accuracy of conventional vision-only SLAM and leads to more robust positioning results. (A toy sketch of loosely coupled fusion follows this entry.)
- Published
- 2021
- Full Text
- View/download PDF
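A toy sketch of the loosely coupled fusion idea from the entry above: the IMU propagates the pose at high rate and each visual SLAM fix corrects the accumulated drift. A real system would use a Kalman filter with proper covariances; the blending constant and class names are illustrative assumptions.

```python
# Toy loosely coupled fusion: IMU dead-reckoning between visual fixes,
# with each (lower-rate) SLAM fix pulling the estimate back toward truth.
import numpy as np

class LooselyCoupledFuser:
    def __init__(self, blend: float = 0.8):
        self.p = np.zeros(3)   # fused position estimate
        self.v = np.zeros(3)   # velocity estimate from IMU integration
        self.blend = blend     # how strongly a visual fix pulls the estimate

    def propagate_imu(self, accel_world: np.ndarray, dt: float) -> None:
        # High-rate prediction; drifts without correction.
        self.v += accel_world * dt
        self.p += self.v * dt

    def correct_visual(self, p_slam: np.ndarray) -> None:
        # When SLAM tracking is healthy, blend its fix into the estimate;
        # a full system would weight by filter covariances instead.
        self.p = (1.0 - self.blend) * self.p + self.blend * p_slam
```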
4. One-Handed Input Through Rotational Motion for Smartwatches
- Author
-
Po Chang Chen, Hsin-Ruey Tsai, Li-Wei Chan, and Yi-Ping Hung
- Subjects
Computer science, Image processing and computer vision, Rotation around a fixed axis, Human Factors and Ergonomics, Computer Science Applications, Human-Computer Interaction, Smartwatch, Computer vision, Artificial intelligence, Motion sensors, Computer graphics - Abstract
One-handed input for smartwatches is crucial when the user's other hand is occupied. Rotational motion, which can be detected by the built-in motion sensors of most smartwatches, is leveraged as input to pro...
- Published
- 2018
- Full Text
- View/download PDF
5. An Ensemble of Invariant Features for Person Reidentification
- Author
-
Yi-Ping Hung, Shen-Chi Chen, Young-Gun Lee, and Jenq-Neng Hwang
- Subjects
Normalization (statistics), Training set, Computer science, Feature extraction, Pattern recognition, Mixture model, Convolutional neural network, Visualization, Robustness (computer science), Media Technology, Computer vision, Artificial intelligence, Electrical and Electronic Engineering, Invariant (mathematics) - Abstract
This paper proposes an ensemble of invariant features (EIFs), which can properly handle variations in color and in human pose/viewpoint when matching pedestrian images observed by different cameras with non-overlapping fields of view. The proposed method is a direct re-identification (re-id) method, which requires no prior domain learning from pre-labeled corresponding training data. The novel features consist of holistic and region-based features. The holistic features are extracted with a publicly available deep convolutional neural network pre-trained for generic object classification. In contrast, the region-based features are extracted using our proposed two-way Gaussian mixture model fitting, which overcomes self-occlusion and pose variations. To generalize better when recognizing identities without additional learning, the ensemble scheme aggregates all feature distances using similarity normalization. The proposed framework is robust against partial occlusion and pose and viewpoint changes. Moreover, the evaluation results show that our method outperforms state-of-the-art direct re-id methods on the challenging viewpoint invariant pedestrian recognition (VIPeR) and 3D people surveillance (3DPeS) benchmark data sets. (A sketch of the distance-aggregation step follows this entry.)
- Published
- 2017
- Full Text
- View/download PDF
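The distance-aggregation step of the ensemble above can be illustrated as follows. The z-score normalizer is my stand-in for the paper's similarity normalization; function and variable names are hypothetical.

```python
# Hypothetical illustration of the ensemble idea: several feature distances
# (deep holistic, region-based, ...) are put on a common scale before summing,
# so no single feature dominates the match.
import numpy as np

def ensemble_distance(dists_per_feature: list[np.ndarray]) -> np.ndarray:
    """Each array holds one probe's distances to all gallery samples under
    one feature; returns the aggregated distance per gallery sample."""
    total = np.zeros_like(dists_per_feature[0], dtype=float)
    for d in dists_per_feature:
        mu, sigma = d.mean(), d.std() + 1e-8
        total += (d - mu) / sigma          # normalize, then accumulate
    return total                           # smallest value = best match

gallery_scores = ensemble_distance([np.random.rand(100) for _ in range(3)])
best = int(np.argmin(gallery_scores))
```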
6. Vision-Based Positioning for Internet-of-Vehicles
- Author
-
Ming-Hsuan Yang, Chun Hsin Wang, Yi-Ping Hung, Chu-Song Chen, Xiao Wei, Qiao Liang, and Kuan-Wen Chen
- Subjects
Engineering, Vision based, Hybrid positioning system, Mechanical Engineering, Precise Point Positioning, Computer Science Applications, Task (project management), Data set, Model compression, Automotive Engineering, Global Positioning System, Computer vision, The Internet, Artificial intelligence - Abstract
This paper presents an algorithm for ego-positioning using a low-cost monocular camera in systems based on the Internet-of-Vehicles. To reduce the computational and memory requirements, as well as the communication load, we treat the model compression task as a weighted k-cover problem that better preserves the critical structures. For real-world vision-based positioning applications, we consider the issue of large scene changes and introduce a model update algorithm to address it. A large positioning data set, collected over more than a month and comprising 106 sessions and 14,275 images, is constructed. Extensive experimental results show that sub-meter accuracy can be achieved by the proposed ego-positioning algorithm, which outperforms existing vision-based approaches. (A greedy k-cover sketch follows this entry.)
- Published
- 2017
- Full Text
- View/download PDF
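The weighted k-cover formulation mentioned above admits a standard greedy approximation, sketched here under my own reading of the abstract; the data layout and weights are assumptions, not the authors' implementation.

```python
# Greedy weighted k-cover for model compression: keep the k 3-D points that
# best cover the database images, weighting each point by an importance score.
def greedy_k_cover(point_to_images: dict[int, set[int]],
                   weights: dict[int, float], k: int) -> list[int]:
    covered: set[int] = set()
    kept: list[int] = []
    candidates = set(point_to_images)
    for _ in range(k):
        # Pick the point whose weighted, not-yet-covered image set is largest.
        best = max(candidates,
                   key=lambda p: weights[p] * len(point_to_images[p] - covered),
                   default=None)
        if best is None:
            break
        kept.append(best)
        covered |= point_to_images[best]
        candidates.remove(best)
    return kept
```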
7. Abandoned Object Detection via Temporal Consistency Modeling and Back-Tracing Verification for Visual Surveillance
- Author
-
Yi-Ping Hung, Daw-Tung Lin, Shen-Chi Chen, Chu-Song Chen, and Kevin Lin
- Subjects
Pixel, Computer Networks and Communications, Computer science, Image processing and computer vision, Cognitive neuroscience of visual object recognition, Pattern recognition, Tracing, Object detection, Visualization, Object-class detection, Video tracking, Code (cryptography), Computer vision, Artificial intelligence, Safety, Risk, Reliability and Quality - Abstract
This paper presents an effective approach for detecting abandoned luggage in surveillance videos. We combine short- and long-term background models to extract foreground objects, with each pixel of an input image classified into a 2-bit code. Subsequently, we introduce a framework that identifies static foreground regions from the temporal transitions of these code patterns and determines whether the candidate regions contain abandoned objects by analyzing the back-traced trajectories of luggage owners. Experimental results on video from the 2006 Performance Evaluation of Tracking and Surveillance (PETS) and 2007 Advanced Video and Signal-based Surveillance (AVSS) databases show that the proposed approach is effective for detecting abandoned luggage and outperforms previous methods. (A toy version of the 2-bit code follows this entry.)
- Published
- 2015
- Full Text
- View/download PDF
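A toy version of the 2-bit code from this abstract: each pixel is thresholded against a short-term and a long-term background model, and the two binary results are packed into one code. Thresholds and background-model upkeep are simplified away.

```python
# Each pixel is foreground (1) or background (0) under a short-term and a
# long-term background model. A static object (e.g., dropped luggage) is soon
# absorbed by the short-term model but stays foreground in the long-term one,
# so it shows up as code 0b01 persisting over time.
import numpy as np

def two_bit_code(frame, short_bg, long_bg, thresh=25):
    fg_short = (np.abs(frame.astype(int) - short_bg) > thresh).astype(np.uint8)
    fg_long = (np.abs(frame.astype(int) - long_bg) > thresh).astype(np.uint8)
    # 00 background, 01 static foreground, 10 uncovered background, 11 moving
    return (fg_short << 1) | fg_long

# Candidate abandoned-object pixels: code == 0b01 for many consecutive frames;
# the paper then back-traces the owner's trajectory to verify abandonment.
```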
8. Large-Area, Multilayered, and High-Resolution Visual Monitoring Using a Dual-Camera System
- Author
-
Yi-Ping Hung, Kuan-Wen Chen, Shen-Chi Chen, Cheng-Wu Chen, and Chih-Wei Lin
- Subjects
Computer Networks and Communications, Property (programming), Computer science, Image processing and computer vision, Process (computing), Display device, Hardware and Architecture, Camera auto-calibration, Computer graphics (images), Calibration, Computer vision, Artificial intelligence, Zoom, Stereo camera, Camera resectioning - Abstract
Large-area, high-resolution visual monitoring systems are indispensable in surveillance applications. Constructing such systems requires high-quality image capture and display devices. Whereas high-quality displays have developed rapidly, as exemplified by the 85-inch 4K ultrahigh-definition TV announced by Samsung at the 2013 Consumer Electronics Show (CES), high-resolution surveillance cameras have progressed slowly and are still not widely used. In this study, we designed an innovative framework using a dual-camera system, comprising a wide-angle fixed camera and a high-resolution pan-tilt-zoom (PTZ) camera, to construct a large-area, multilayered, high-resolution visual monitoring system that features multiresolution monitoring of moving objects. First, we developed a novel calibration approach to estimate the relationship between the two cameras and to calibrate the PTZ camera. The PTZ camera was calibrated based on the property that distinct pan-tilt angles remain consistent across zoom factors, accelerating the calibration process without affecting accuracy; this calibration process has not been reported previously. After calibrating the dual-camera system, we used the PTZ camera to synthesize a large-area, high-resolution background image. When foreground targets were detected in the images captured by the wide-angle camera, the PTZ camera was controlled to continuously track the user-selected target. Finally, we integrated the preconstructed high-resolution background, the low-resolution foreground captured by the wide-angle camera, and the high-resolution foreground captured by the PTZ camera to generate a large-area, multilayered, high-resolution view of the scene.
- Published
- 2015
- Full Text
- View/download PDF
9. Viewing-Distance Aware Super-Resolution for High-Definition Display
- Author
-
Ming-Hsuan Yang, Soo-Chang Pei, Yi-Ping Hung, Chih-Tsung Shen, and Hung-Hsun Liu
- Subjects
Deblurring, Image processing and computer vision, Gaussian blur, Image processing, Total variation denoising, Computer Graphics and Computer-Aided Design, Subjective constancy, Image Processing, Computer-Assisted, Photography, Visual Perception, Animals, Humans, Computer vision, Artificial intelligence, Image resolution, Algorithms, Software, Image restoration, Computer graphics, Mathematics, Feature detection (computer vision) - Abstract
In this paper, we propose a novel algorithm for high-definition displays that enlarges low-resolution images while maintaining perceptual constancy (i.e., the same field of view, perceptual blur radius, and retinal image size in the viewer's eyes). We model the relationship between a viewer and a display by considering two main aspects of visual perception: the scaling factor and the perceptual blur radius. By enlarging an image while adjusting its blur levels on the display, we can maintain the viewer's perceptual constancy. We show that the scaling factor should be set in proportion to the viewing distance and that the blur levels on the display should be adjusted according to the focal length of the viewer. Toward this, we first use edge directions to interpolate the low-resolution image in accordance with the viewing distance and the scaling factor. After interpolation, we use local contrast to estimate the spatially varying blur levels of the interpolated image. We then adjust the blur levels using a parametric deblurring method that combines L1 and L2 reconstruction errors with Tikhonov and total variation regularization terms. By taking these factors into account, high-resolution images adapted to the viewing distance can be generated. Experimental results on both natural-image metrics and subjective user studies across image scales demonstrate that the proposed super-resolution algorithm for high-definition displays performs favorably against state-of-the-art methods. (A worked example of the distance-to-scale relation follows this entry.)
- Published
- 2015
- Full Text
- View/download PDF
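The stated relation between viewing distance and scaling factor can be made concrete with a tiny worked example (numbers illustrative only): equal retinal size means equal visual angle, so on-screen height must grow linearly with distance.

```python
# Worked toy example of the geometry in the abstract: to keep the retinal
# image size constant, the on-screen scaling factor grows in proportion to
# the viewing distance.
def scale_for_distance(d_view: float, d_ref: float = 1.0) -> float:
    """Same visual angle: tan(theta) = h / d, so h must scale with d."""
    return d_view / d_ref

for d in (1.0, 2.0, 3.0):
    print(f"viewing distance {d} m -> enlarge {scale_for_distance(d):.1f}x")
# The blur level shown on screen is likewise adjusted (per the paper, from
# the viewer's focal length) so the perceived blur radius stays constant.
```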
10. Hybrid Method for 3-D Gaze Tracking Using Glint and Contour Features
- Author
-
Chih-Chuan Lai, Sheng-Wen Shih, and Yi-Ping Hung
- Subjects
Computer science, Image processing and computer vision, Curvature, Gaze, Pupil, Cornea, Media Technology, Eye tracking, Computer vision, Artificial intelligence, Electrical and Electronic Engineering - Abstract
Glint features play an important role in gaze-tracking systems. However, when the operating range of a gaze-tracking system is enlarged, the performance of glint-feature-based (GFB) approaches degrades, mainly because of the curvature variation around the edge of the cornea. Although the pupil contour may provide complementary information to help estimate the eye gaze, existing methods do not properly handle cornea refraction, leading to inaccurate results. This paper describes a contour-feature-based (CFB) 3-D gaze-tracking method that properly accounts for cornea refraction. We also show that the GFB and CFB approaches can be formulated in a unified framework and thus easily integrated. Furthermore, the two methods should be integrated because they provide complementary information: integration leverages the strengths of both features, giving the system robustness and flexibility. Computer simulations and real experiments show the effectiveness of the proposed approach for gaze tracking.
- Published
- 2015
- Full Text
- View/download PDF
11. Visual enhancement via reinforcement parameter learning for low backlighted display
- Author
-
Yi-Ping Hung, Ching-Hao Lai, Soo-Chang Pei, and Chih-Tsung Shen
- Subjects
Brightness, Computer science, Q-learning, Backlight, Compensation (engineering), Image (mathematics), Salience (neuroscience), Reinforcement learning, Computer vision, Artificial intelligence, Reinforcement, Mobile device - Abstract
In this technical brief, we propose a system that enhances images shown on a dimly backlit display in order to save electrical power on mobile devices. In addition to brightness compensation, we take human visual perception, such as the just-noticeable difference (JND) and saliency, into account. Integrating these characteristics into one system requires adjusting several parameters, so we introduce reinforcement parameter learning: by taking actions and analyzing rewards, we train the characteristic parameters offline and enhance images online. Experimental results show that our visual enhancement via reinforcement parameter learning outperforms existing systems.
- Published
- 2017
- Full Text
- View/download PDF
12. SegTouch
- Author
-
Hsin-Ruey Tsai, Yi-Ping Hung, Min-Chieh Hsiu, Da-Yuan Huang, Mike Y. Chen, Te-Yen Wu, Bing-Yu Chen, and Jui-Chun Hsiao
- Subjects
Modality (human-computer interaction), Information interfaces and presentation (HCI), Computer science, Index finger, Thumb, Touchscreen, Computer vision, Artificial intelligence, Haptic technology, Gesture - Abstract
Insufficient input modality on touchscreens means that icons, toolbars, and mode-switching steps are required to perform different functions. Although various methods have been proposed to increase touchscreen input modality, touch gestures (e.g., swipe), which are common in touch input, are not supported by previous methods (e.g., Force Touch on the iPhone 6s). This still restricts the input modality of touchscreens. Hence, we propose SegTouch, which enriches touch input while preserving touch gestures. SegTouch uses thumb-to-index-finger gestures, i.e., the thumb slides on the index finger, to define various touch purposes. Based on a pilot study, the middle and base segments of the index finger are suitable input areas for SegTouch. To observe how users leverage proprioception and the natural haptic feedback from index-finger landmarks to perform SegTouch, different layouts on the index-finger segments were examined in an eyes-free condition. Including normal touch without a thumb-to-index-finger gesture, SegTouch provides nine input modalities along with touch gestures on the screen, enabling novel applications.
- Published
- 2017
- Full Text
- View/download PDF
13. Facial Trait Code
- Author
-
Gee-Sern Hsu, Yi-Ping Hung, Tsuhan Chen, and Ping-Han Lee
- Subjects
Boosting (machine learning), Face hallucination, Computer science, Image processing and computer vision, Pattern recognition, Facial recognition system, Media Technology, Three-dimensional face recognition, Computer vision, Artificial intelligence, Electrical and Electronic Engineering, Cluster analysis, Face detection - Abstract
We propose a facial trait code (FTC) to encode human facial images, and we apply it to face recognition. Extracted from an exhaustive set of local patches cropped from a large stack of faces, the facial traits and the associated trait patterns can accurately capture the appearance of a given face. The extraction has two phases. The first phase consists of clustering and boosting on a training set of faces with neutral expression, even illumination, and frontal pose. The second phase focuses on extracting the facial trait patterns from faces with variations in expression, illumination, and pose. To apply the FTC to face recognition, two types of codewords, hard and probabilistic, with different metrics for characterizing the facial trait patterns are proposed. The hard codeword offers a concise representation of a face, while the probabilistic codeword enables matching with better accuracy. Our experiments compare the proposed FTC to other algorithms on several public datasets, all showing promising results.
- Published
- 2013
- Full Text
- View/download PDF
14. Exploring Manipulation Behavior on Video See-Through Head-Mounted Display with View Interpolation
- Author
-
Han-Lei Wang, Ping-Hsuan Han, Chun-Jui Lai, and Yi-Ping Hung
- Subjects
Computer science, Virtual world, Image processing and computer vision, Optical head-mounted display, Rendering algorithms, Viewpoints, Image (mathematics), Perception, Computer graphics (images), Computer vision, Artificial intelligence, Motion interpolation, Computer graphics, Interpolation - Abstract
A video see-through HMD mixes the real and virtual worlds. Users can have a good experience of the virtual part, but the real part captured by the cameras still has problems, especially with distance perception. In this paper, we try to remove the error caused by the offset between the cameras and the user's eyes. We use a depth image-based rendering algorithm to recompute the true distances in the scene and render a geometrically correct image to the user, and we use multiple cameras with different viewpoints to reduce the occlusion areas. (A compact DIBR sketch follows this entry.)
- Published
- 2017
- Full Text
- View/download PDF
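A compact sketch of the depth image-based rendering (DIBR) step the abstract relies on: back-project each pixel using its depth, translate by the camera-to-eye offset, and re-project. It omits hole filling, which is what the multiple cameras are for; the pinhole model and all names are assumptions.

```python
# DIBR reprojection for the camera-to-eye offset problem on a video
# see-through HMD: no hole filling or occlusion handling here.
import numpy as np

def reproject_to_eye(depth: np.ndarray, K: np.ndarray, baseline: np.ndarray):
    """depth: (H, W) metric depth from the HMD camera; K: 3x3 intrinsics;
    baseline: (3,) camera-to-eye translation. Returns new pixel coords."""
    H, W = depth.shape
    u, v = np.meshgrid(np.arange(W), np.arange(H))
    pix = np.stack([u, v, np.ones_like(u)], axis=-1).reshape(-1, 3).T
    pts = np.linalg.inv(K) @ pix * depth.reshape(1, -1)   # 3-D points, camera frame
    pts_eye = pts + baseline.reshape(3, 1)                # move origin to the eye
    proj = K @ pts_eye
    return (proj[:2] / proj[2]).T.reshape(H, W, 2)        # where each pixel lands
```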
15. RunPlay
- Author
-
Ting-Wei Chiu, Yi-Ping Lo, Zhi-Wei Yang, Shi-Yao Wei, Chen-Yu Wang, Hsing-Man Wang, and Yi-Ping Hung
- Subjects
Focus (computing), Computer science, Wearable computer, Pressure sensor, Action (philosophy), Inertial measurement unit, Action recognition, Computer vision, Artificial intelligence, Mobile device, Avatar - Abstract
In this paper, we present an action recognition system consisting of pressure insoles, with 16 pressure sensors, and an inertial measurement unit. By analyzing the data measured by these sensors, we are able to recognize several human activities; here we focus on detecting jumping, squatting, and moving left and right. We also designed a parkour game on a mobile device to demonstrate in-game control of an avatar by human action.
- Published
- 2016
- Full Text
- View/download PDF
16. iKneeBraces
- Author
-
Jin-Jong Chen, Jui-Chun Hsiao, Shih-Yao Wei, Hsin-Ruey Tsai, Yi-Ping Lo, Yi-Ping Hung, Ting-Wei Chiu, and Chi-Feng Keng
- Subjects
Inertial frame of reference, Computer science, Osteoarthritis, Thigh, Gait, Adduction moment, Center of pressure (terrestrial locomotion), Computer vision, Force platform, Artificial intelligence, Simulation, Motion sensors - Abstract
We propose light-weight wearable devices, iKneeBraces, to help prevent knee osteoarthritis (OA) through knee adduction moment (KAM) evaluation. Each iKneeBrace consists of two inertial measurement units (IMUs) that measure shin and thigh angles. KAM is estimated from the ground reaction force (GRF), the knee position, and the center of pressure. Instead of the heavy, bulky 3-DoF force plates conventionally used, we propose a two-input regression model that uses the shin and thigh angles from iKneeBrace to infer the GRF direction and, from it, estimate KAM. We performed an experiment to evaluate the method. The results show that iKneeBrace can infer a KAM close to the ground truth at the first peak, the most important part for preventing knee OA. Furthermore, the proposed method could infer KAM in all parts of the curve if better IMUs were used in a future iKneeBrace. The proposed method thus makes KAM evaluation portable while requiring only light-weight devices.
- Published
- 2016
- Full Text
- View/download PDF
17. MovingScreen
- Author
-
Yi-Ping Hung, Lee-Ting Huang, Chen-Hsin Hsieh, Hsin-Ruey Tsai, and Da-Yuan Huang
- Subjects
Computer science, Word error rate, Thumb, Display size, Interaction method, Calibration, Computer vision, Moving speed, Artificial intelligence, Mobile device, Gesture - Abstract
Smartphone screens have become larger, which makes it hard for users to reach a target with the thumb when holding the phone in one hand. We propose an interaction method, MovingScreen, that solves this hard-to-reach problem by making the smartphone screen view movable. Users move the screen view to bring the target into their comfort zone, the area in which they can comfortably perform touch input with the thumb, and then select the target easily. Using the proposed triggering gesture, bezel-scroll, MovingScreen automatically calibrates the comfort zone: bezel-scroll detects the comfort zone and provides different screen-moving ratios for users holding the smartphone in different poses. Users can even adjust the screen-moving speed by altering the bezel-scroll length. We evaluated the performance of MovingScreen and other methods in a user study. The results show that MovingScreen achieves a selection time (1030.58 ms) similar to the other methods but a lower error rate (4.57%).
- Published
- 2016
- Full Text
- View/download PDF
18. Nail+
- Author
-
Yi-Ping Hung, Chiuan Wang, Jhe-Wei Lin, Da-Yuan Huang, Mike Y. Chen, De-Nian Yang, Min-Chieh Hsiu, and Yu-Chih Lin
- Subjects
Finger force, Focus (computing), Nail deformation, Computer science, Strain sensor, Deformation (meteorology), Nail (fastener), Computer vision, Artificial intelligence, Strain gauge, Gesture - Abstract
Force sensing has been widely used to extend touch from binary to multiple states, creating new abilities for surface interactions. However, previously proposed force-sensing techniques mainly focus on enabling force-applied gestures on specific devices. This paper presents Nail+, a technique that uses fingernail deformation to enable force-touch interactions on everyday rigid surfaces. Our prototype, a 3x3 array of 0.2 mm strain sensors mounted on a fingernail, was implemented and evaluated in a 12-participant study on the feasibility of this sensing approach. Results showed that the accuracy for sensing normal and force-applied tapping and swiping reaches 84.67% on average. Finally, we propose two example applications that use the Nail+ prototype to control the interfaces of head-mounted display (HMD) devices and remote screens.
- Published
- 2016
- Full Text
- View/download PDF
19. TouchRing
- Author
-
Jui-Chun Hsiao, Hsin-Ruey Tsai, Lee-Ting Huang, Yi-Ping Hung, Min-Chieh Hsiu, and Mike Y. Chen
- Subjects
Modality (human-computer interaction), Information interfaces and presentation (HCI), Computer science, Capacitive sensing, Multi-touch, Index finger, Thumb, Middle finger, Gesture recognition, Computer vision, Artificial intelligence, Gesture - Abstract
We propose TouchRing, a finger-worn touch device that provides subtle, multi-touch input. TouchRing leverages printed electrodes and capacitive sensing to detect touch input, allowing users to perform multi-touch gestures with one hand to increase input modality. Worn on the index finger, TouchRing supports multi-touch using the thumb and middle finger. Ten multi-touch gestures are designed in this paper, and we propose touch-detection and gesture-recognition approaches for TouchRing. Gesture recognition accuracy is evaluated in a user study. We also propose applications that make controlling smart glasses more convenient.
- Published
- 2016
- Full Text
- View/download PDF
20. View interpolation for video see-through head-mounted display
- Author
-
Chun-Jui Lai, Yi-Ping Hung, and Ping-Hsuan Han
- Subjects
Computer science, Interpolation (computer graphics), Image processing and computer vision, Optical head-mounted display, Mixed reality, Single camera, Depth map, Computer graphics (images), Immersion (virtual reality), Computer vision, Augmented reality, Artificial intelligence - Abstract
A head-mounted display (HMD) provides an immersive virtual reality experience, but the user cannot see any information from the real world. To solve this problem, a video see-through HMD can acquire images of the real environment and present them inside the HMD, from which a mixed reality (MR) or augmented reality (AR) system can be built. However, how to mount and calibrate cameras on an HMD so as to recover the real environment is still an open research issue. The HTC VIVE has a single camera on the front of the device, while [Steptoe et al. 2014] and the OVRVISION Pro mount dual cameras to capture left and right images. Because of the viewpoint difference, the images captured by the cameras differ from what the human eyes would see (Figure 2). Although true 3D information can be recovered with a depth map, some occlusion areas cannot be recovered from a single camera; multiple cameras at different positions can therefore complement one another to reduce the occlusion areas. In this work, four configurations are simulated with a synthesized scene.
- Published
- 2016
- Full Text
- View/download PDF
21. Visual enhancement using sparsity-based image decomposition for low backlight displays
- Author
-
Soo-Chang Pei, Yi-Ping Hung, Zongqing Lu, and Chih-Tsung Shen
- Subjects
Brightness, Engineering, Boosting (machine learning), Liquid-crystal display, Visual perception, Image processing and computer vision, Backlight, Visualization, Computer vision, Artificial intelligence, Layer (object-oriented design), LED display - Abstract
We propose a power-constrained image enhancement system that maintains human visual perception when an LCD or LED display is under low backlight. Adopting a low-backlight mode saves electricity and extends battery life. First, we derive the relationship between the image and the backlight that maintains the same visual perceptual quality. Then, we propose a sparsity-based image decomposition that separates the intensity image into a base layer and a detail layer. Afterwards, we use the image-backlight relationship to compensate the base layer, and we adopt texture-aware boosting to enhance the detail layer. Simulation results show that our system outperforms the compared systems. (An illustrative base/detail sketch follows this entry.)
- Published
- 2016
- Full Text
- View/download PDF
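An illustrative stand-in for the pipeline above: split the image into base and detail layers, compensate the base for the dim backlight, and boost the detail. A Gaussian blur replaces the paper's sparsity-based decomposition here, and the gains are arbitrary placeholders.

```python
# Base/detail enhancement for a low-backlight display: brighten the smooth
# base layer, boost the residual detail layer, then recombine.
import numpy as np
from scipy.ndimage import gaussian_filter

def enhance_for_low_backlight(img: np.ndarray, gain: float = 1.4,
                              detail_boost: float = 1.6) -> np.ndarray:
    base = gaussian_filter(img.astype(float), sigma=5)  # smooth base layer
    detail = img - base                                 # residual detail layer
    out = np.clip(gain * base + detail_boost * detail, 0, 255)
    return out.astype(np.uint8)
```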
22. The construction of a high-resolution visual monitoring for hazard analysis
- Author
-
Cheng-Wu Chen, Yi-Ping Hung, Wei-Ling Chiang, Wen Ko Hsu, and Chi Wei Lin
- Subjects
High rate, Atmospheric Science, Computer science, Image processing and computer vision, Hazard analysis, Dome (geology), Environmental monitoring, Earth and Planetary Sciences (miscellaneous), Calibration, Computer vision, Artificial intelligence, Water Science and Technology, Remote sensing, Visual monitoring - Abstract
High rates of urbanization, environmental degradation, and industrial development in disaster-prone areas have all served to increase the extent of damage following catastrophes. In this paper, we present a novel framework for exploiting multi-resolution dual cameras, including a wide-angle camera and a speed dome camera, to construct a wide-angle, multi-layered, high-resolution visual monitoring system for hazard assessment. Our two-part camera system requires calibration of the correspondence between the detailed and overview image captured by the speed dome camera and wide-angle fixed camera, respectively. In-factory calibration is carried out using white–black patterns for speed dome turning and multi-layered calibration. The results are displayed as wide-angle, multi-layered, high-resolution images which are built up by the speed dome camera.
- Published
- 2012
- Full Text
- View/download PDF
23. Subject-Specific and Pose-Oriented Facial Features for Face Recognition Across Poses
- Author
-
Yi-Ping Hung, Yun-Wen Wang, Ping-Han Lee, and Gee-Sern Hsu
- Subjects
Biometry, Face hallucination, Computer science, Feature extraction, Facial recognition system, Decision Support Techniques, Pattern Recognition, Automated, Artificial Intelligence, Image Interpretation, Computer-Assisted, Humans, Three-dimensional face recognition, Computer vision, AdaBoost, Electrical and Electronic Engineering, Face detection, Hidden Markov model, Contextual image classification, Pattern recognition, General Medicine, Computer Science Applications, Facial Expression, Human-Computer Interaction, Control and Systems Engineering, Face, Artificial intelligence, Algorithms, Software, Information Systems - Abstract
Most face recognition scenarios assume that frontal faces or mug shots are available for enrollment in the database, while faces of other poses are collected in the probe set. Given a face from the probe set, one needs to determine whether a match exists in the database. This reflects the assumption that in forensic applications most suspects have mug shots available in the database, and face recognition aims to recognize the suspects when their faces are captured in various poses by a surveillance camera. This paper considers a different scenario: given a face with multiple poses available, which may or may not include a mug shot, develop a method to recognize the face in poses different from those captured. That is, given two disjoint sets of poses of a face, one for enrollment and the other for recognition, this paper reports a method best suited for handling such cases. The proposed method includes feature extraction and classification. For feature extraction, we first cluster the poses of each subject's face in the enrollment set into a few pose classes and then decompose the appearance of the face in each pose class using an Embedded Hidden Markov Model, which allows us to define a set of subject-specific and pose-oriented (SSPO) facial components for each subject. For classification, an AdaBoost weighting scheme is used to fuse the component classifiers with SSPO component features. The proposed method is shown to outperform other approaches, including a component-based classifier with manually cropped local facial features, in an extensive performance evaluation study.
- Published
- 2012
- Full Text
- View/download PDF
24. Viewpoint-Independent Object Detection Based on Two-Dimensional Contours and Three-Dimensional Sizes
- Author
-
Yen-Liang Lin, Cheng-Chih Tsai, Ping-Han Lee, Yi-Ping Hung, Chia-Hsiang Wu, and Shen-Chi Chen
- Subjects
Engineering, Matching (graph theory), Mechanical Engineering, Pedestrian detection, Template matching, Feature extraction, Image processing and computer vision, Poison control, Object (computer science), Object detection, Computer Science Applications, Automotive Engineering, Computer vision, Algorithm design, Artificial intelligence - Abstract
We propose a viewpoint-independent object-detection algorithm that detects objects in videos based on their 2-D and 3-D information. Object-specific quasi-3-D templates are proposed and applied to match objects' 2-D contours and to calculate their 3-D sizes. A quasi-3-D template is the contour and 3-D bounding cube of an object viewed from a certain panning and tilting angle. A total of 2660 pedestrian templates and 1995 vehicle templates, encompassing 19 tilting and 35 panning angles, are used in this study. To detect objects, we first match the 2-D contours of object candidates with known objects' contours, identifying the object templates with large 2-D contour-matching scores. In this step, we exploit prior knowledge of the viewpoint from which the object is seen to speed up template matching, and a viewpoint likelihood is assigned to each contour-matched template. We then calculate the 3-D widths, heights, and lengths of the contour-matched candidates, along with the corresponding 3-D-size-matching scores. The overall matching score combines the aforementioned likelihood and scores. The major contribution of this paper is to explore the joint use of 2-D and 3-D features in object detection: by considering 2-D contours and 3-D sizes together, one can achieve promising object detection rates. The proposed algorithms were evaluated on both pedestrian and vehicle sequences and yielded significantly better detection results than the best results reported in PETS 2009, showing that our algorithm outperforms state-of-the-art pedestrian-detection algorithms.
- Published
- 2011
- Full Text
- View/download PDF
25. Multi-Resolution Design for Large-Scale and High-Resolution Monitoring
- Author
-
Chih-Wei Lin, Mike Yen-Yang Chen, Yi-Ping Hung, Tzu-Hsuan Chiu, and Kuan-Wen Chen
- Subjects
Focus (computing), Computer science, Real-time computing, Image processing and computer vision, Frame rate, Computer Science Applications, Projector, Signal Processing, Peripheral vision, Human visual system model, Media Technology, Computer vision, Artificial intelligence, Electrical and Electronic Engineering, User interface, Image resolution - Abstract
Large-scale, high-resolution monitoring systems are ideal for many visual surveillance applications. However, existing approaches either have insufficient resolution and low frame rates, or have high complexity and cost. Taking inspiration from the human visual system, we propose a multi-resolution design, e-Fovea, which provides peripheral vision together with a steerable fovea at higher resolution. In this paper, we first present two user studies, with a total of 36 participants, comparing e-Fovea to two existing multi-resolution visual monitoring designs. The results show that for visual monitoring tasks, the e-Fovea design with steerable focus is significantly faster than existing approaches and preferred by users. We then present our design and implementation of e-Fovea, which combines multi-resolution camera input with multi-resolution steerable projector output. Finally, we present our deployment of e-Fovea in three installations to demonstrate its feasibility.
- Published
- 2011
- Full Text
- View/download PDF
26. Editing by Viewing: Automatic Home Video Summarization by Viewing Behavior Analysis
- Author
-
Chia-Han Chang, Wen-Yan Chang, Wei-Jia Huang, Wei-Ting Peng, Yi-Ping Hung, Wei-Ta Chu, and Chien-Nan Chou
- Subjects
Facial expression, Computer science, Eye movement, Sensor fusion, Facial recognition system, Automatic summarization, Motion (physics), Computer Science Applications, Visualization, Human-computer interaction, Signal Processing, Media Technology, Computer vision, Artificial intelligence, Electrical and Electronic Engineering - Abstract
In this paper, we propose the Interest Meter (IM), a system that makes the computer conscious of users' reactions in order to measure their interest and use it for video summarization. The IM takes account of users' spontaneous reactions while they view videos. To estimate viewing interest, quantitative interest measures are devised from the perspectives of attention and emotion. For estimating attention states, variations in the user's eye movement, blinking, and head motion are considered. For estimating emotion states, facial expression is recognized as positive or neutral emotion. By combining the attention and emotion characteristics with a fuzzy fusion scheme, we transform users' viewing behaviors into quantitative interest scores, determine the interesting parts of videos, and finally concatenate them into video summaries. Experimental results show that the proposed concept "editing by viewing" works well and may provide a promising direction for considering the human factor in video summarization. (A simplified fusion sketch follows this entry.)
- Published
- 2011
- Full Text
- View/download PDF
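A deliberately simplified, hypothetical version of the Interest Meter's fusion step, with a weighted sum standing in for the paper's fuzzy fusion scheme; all cue names, scales, and weights are invented for illustration.

```python
# Per-shot attention cues (eye movement, blink, head motion) and an emotion
# cue are combined into one interest score; shots scoring above a threshold
# go into the summary.
def interest_score(eye_stability: float, blink_rate: float,
                   head_stillness: float, positive_emotion: float) -> float:
    attention = (eye_stability + (1.0 - blink_rate) + head_stillness) / 3.0
    return 0.6 * attention + 0.4 * positive_emotion   # all cues in [0, 1]

shots = {"shot1": interest_score(0.9, 0.2, 0.8, 0.7),
         "shot2": interest_score(0.3, 0.6, 0.4, 0.1)}
summary = [name for name, score in shots.items() if score > 0.5]
```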
27. Region-based image retrieval using color-size features of watershed regions
- Author
-
Yi-Ping Hung, Hsuan Yang, Greg C. Lee, and Cheng-Chieh Chiang
- Subjects
Computer science, Image processing and computer vision, Pattern recognition, Content-based image retrieval, Automatic image annotation, Image texture, Feature (computer vision), Signal Processing, Media Technology, Computer vision, Computer Vision and Pattern Recognition, Visual Word, Artificial intelligence, Electrical and Electronic Engineering, Image retrieval, Earth mover's distance, Feature detection (computer vision) - Abstract
This paper presents a region-based image retrieval system that provides a user interface for specifying the watershed regions of interest within a query image. We first propose a new type of visual feature, called the color-size feature, which includes a color-size histogram and moments, to integrate the color and region-size information of watershed regions. Next, we design a region-filtering scheme based on the color-size histogram to quickly screen out the most irrelevant regions and images as preprocessing for retrieval. Our region-based image retrieval system applies the Earth Mover's Distance in the design of its similarity measure for image ranking and matching. Finally, we present experiments on the color-size feature, region filtering, and retrieval results that demonstrate the efficiency of our proposed system. (A toy EMD matching example follows this entry.)
- Published
- 2009
- Full Text
- View/download PDF
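A toy example of matching region signatures with the Earth Mover's Distance. The real color-size feature is multi-dimensional; here each region is reduced to a 1-D hue histogram with the region size folded into the mass so that SciPy's 1-D EMD applies.

```python
# Build a size-weighted hue signature per region and compare two regions
# with the 1-D Earth Mover's (Wasserstein) distance.
import numpy as np
from scipy.stats import wasserstein_distance

def region_signature(hues: np.ndarray, region_size: float, bins: int = 16):
    hist, _ = np.histogram(hues, bins=bins, range=(0.0, 1.0))
    weights = hist.astype(float) * region_size   # fold region size into the mass
    centers = (np.arange(bins) + 0.5) / bins
    return centers, weights

c1, w1 = region_signature(np.random.rand(500), region_size=0.3)
c2, w2 = region_signature(np.random.rand(400), region_size=0.2)
dist = wasserstein_distance(c1, c2, u_weights=w1, v_weights=w2)
```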
28. Efficient hierarchical method for background subtraction
- Author
-
Yi-Ping Hung, Chu-Song Chen, Chun-Rong Huang, and Yu-Ting Chen
- Subjects
Background subtraction, Pixel, Hierarchy (mathematics), Computer science, Pattern recognition, Object detection, Artificial Intelligence, Component (UML), Histogram, Signal Processing, Segmentation, Computer vision, Computer Vision and Pattern Recognition, Artificial intelligence, Software, Block (data storage) - Abstract
Detecting moving objects using an adaptive background model is a critical component of many vision-based applications. Most background models have been maintained in pixel-based form, while some approaches have begun to study block-based representations, which are more robust to non-stationary backgrounds. In this paper, we propose a method that combines pixel-based and block-based approaches in a single framework. We show that efficient hierarchical backgrounds can be built by considering that the two approaches complement each other. In addition, a novel descriptor is proposed for block-based background modeling at the coarse level of the hierarchy. Quantitative evaluations show that the proposed hierarchical method provides better results than existing single-level approaches. (A coarse-to-fine sketch follows this entry.)
- Published
- 2007
- Full Text
- View/download PDF
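The coarse-to-fine idea above can be sketched as follows: test a cheap block-level statistic first and run the per-pixel test only inside blocks flagged as changed. The mean-intensity block descriptor and the thresholds are placeholders, not the paper's novel descriptor.

```python
# Two-level background subtraction: block-level screening, then per-pixel
# refinement only where a block differs from the background model.
import numpy as np

def hierarchical_foreground(frame, bg, block=16, t_block=8.0, t_pixel=25):
    H, W = frame.shape
    fg = np.zeros((H, W), dtype=bool)
    for y in range(0, H, block):
        for x in range(0, W, block):
            f = frame[y:y+block, x:x+block].astype(float)
            b = bg[y:y+block, x:x+block].astype(float)
            if abs(f.mean() - b.mean()) < t_block:
                continue                      # block looks like background: skip
            fg[y:y+block, x:x+block] = np.abs(f - b) > t_pixel
    return fg
```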
29. 3D Printing and Camera Mapping
- Author
-
I-Chun Chen, He-Lin Luo, and Yi-Ping Hung
- Subjects
Sculpture, Computer science, Image processing and computer vision, Video art, Projection mapping, New media, Virtual image, Digital art, Camera auto-calibration, Computer graphics (images), Superimposition, Computer vision, Artificial intelligence, Computer graphics - Abstract
Projection Mapping, the superimposing of virtual images upon actual objects, is already extensively used in performance arts. Applications of it are already quite mature, therefore, here we wish to achieve the opposite, or specifically speaking, the superimposing of actual objects into virtual images. This method of reverse superimposition is called "camera mapping." Through cameras, camera mapping captures actual objects, and introduces them into a virtual world. Then using superimposition, this allows for actual objects to be rendered as virtual objects. However, the actual objects here must have refined shapes so that they may be superimposed back into the camera. Through the proliferation of 3D printing, virtual 3D models in computers can be created in reality, thereby providing a framework for the limits and demands of "camera mapping." The new media artwork Digital Buddha combines 3D Printing and camera mapping. This work was created by 3-D deformable modeling through a computer, then transforming the model into a sculpture using 3D printing, and then remapping the materially produced sculpture back into the camera. Finally, it uses the already known algorithm to convert the model back into that of the original non-deformed sculpture. From this creation project, in the real world, audiences will see a deformed, abstract sculpture; and in the virtual world, through camera mapping, they will see a concrete sculpture (Buddha). In its representation, this piece of work pays homage to the work TV Buddha produced by video art master Nam June Paik. Using the influence television possesses over people, this work extends into the most important concepts of the digital era, "coding" and "decoding," simultaneously addressing the shock and insecurity people in the digital era feel toward images.
- Published
- 2015
- Full Text
- View/download PDF
30. An ensemble of invariant features for person re-identification
- Author
-
Jenq-Neng Hwang, Young-Gun Lee, Jang-Hee Yoo, Yi-Ping Hung, and Shen-Chi Chen
- Subjects
Ensemble forecasting, Local binary patterns, Feature extraction, Pattern recognition, Mixture model, Convolutional neural network, Robustness (computer science), Histogram, Computer vision, Artificial intelligence, Invariant (mathematics), Mathematics - Abstract
We propose an ensemble of invariant features for person re-identification. The proposed method requires no domain learning and can effectively overcome the issues created by variations in human pose and viewpoint between a pair of different cameras. Our ensemble model utilizes both holistic and region-based features. To avoid the misalignment problem, the test human object sample is used to generate multiple virtual samples by applying slight geometric distortion. The holistic features are extracted from a publicly available pre-trained deep convolutional neural network, while the region-based features are based on our proposed two-way Gaussian mixture model fitting and the completed local binary pattern (CLBP) texture representation. For better generalization during matching, without an additional learning process for feature aggregation, the ensemble scheme combines all three feature distances using distance normalization. The proposed framework achieves robustness against partial occlusion and pose and viewpoint changes. In addition, the experimental results show that our method exceeds state-of-the-art person re-identification performance on the challenging 3DPeS benchmark.
- Published
- 2015
- Full Text
- View/download PDF
31. Freehand Push-Gesture Recognition via 3D Palm Trajectory Modeling
- Author
-
Chuen-Kai Shie, Yi-Ping Hung, Chu-Song Chen, and Shih-Yao Lin
- Subjects
Engineering, Gesture recognition, Speech recognition, Pattern recognition (psychology), Trajectory, Selection (linguistics), Computer vision, Artificial intelligence, Gesture - Abstract
This paper aims to improve the recognition of the 3D push hand gesture, which triggers a target-selection command with the hands in the air. Although general 3D push-gesture recognizers have been developed and are widely used for this purpose, a severe weakness of current recognizers is that they are unstable under askew pushes, which happen frequently in practice. This is because the push trajectory of the hand is not always a straight forward movement, owing to human anatomy, but varies with the location of the target relative to the user. We explore the 3D palm trajectories of push gestures at different locations around the user and propose a 3D push-gesture modeling approach that learns 3D palm trajectories to solve the askew-push problem. We evaluate the proposed recognizer on a click-gesture dataset and compare it with prior forward-push recognizers. Experimental results demonstrate that our approach achieves higher recognition accuracy than existing approaches.
- Published
- 2015
- Full Text
- View/download PDF
32. Location-aware object detection via coherent region grouping
- Author
-
Yi-Ping Hung, Shen-Chi Chen, Chu-Song Chen, and Kevin Lin
- Subjects
Similarity (geometry), Computer science, Feature extraction, Pattern recognition, Coherence (statistics), Object detection, Object-class detection, Discriminative model, Computer vision, Viola–Jones object detection framework, Artificial intelligence, Adaptation (computer science) - Abstract
We present a scene adaptation algorithm for object detection. Our method discovers scene-dependent features that are discriminative for classifying foreground objects into different categories. Unlike previous works, which suffer from insufficient training data collected online, our approach incorporates a similarity grouping procedure that automatically gathers more consistent training examples from neighboring areas. Experimental results show that the proposed method outperforms several related works with higher detection accuracy.
- Published
- 2015
- Full Text
- View/download PDF
33. Comparison between immersion-based and toboggan-based watershed image segmentation
- Author
-
Yi-Ping Hung, Yung-Chieh Lin, Zen-Chung Shih, and Yu-Pao Tsai
- Subjects
Watershed, Computer science, Information Storage and Retrieval, Signal Processing, Computer-Assisted, Image processing, Image segmentation, Image Enhancement, Computer Graphics and Computer-Aided Design, Pattern Recognition, Automated, Imaging, Three-Dimensional, Image Interpretation, Computer-Assisted, Immersion (virtual reality), Segmentation, Computer vision, Artificial intelligence, Algorithms, Software - Abstract
Watershed segmentation has recently become a popular tool for image segmentation. There are two approaches to implementing it: the immersion approach and toboggan simulation. Conceptually, the immersion approach can be viewed as proceeding from low altitude to high altitude, and the toboggan approach from high altitude to low altitude. The former has been more popular recently (e.g., Vincent and Soille), but the latter has its own supporters (e.g., Mortensen and Barrett). It was not clear whether the two approaches could lead to exactly the same segmentation result, nor which approach was more efficient. In this paper, we present two "order-invariant" algorithms for watershed segmentation, one based on the immersion approach and the other on the toboggan approach. By introducing a special RIDGE label to achieve order-invariance, we find that the two conceptually opposite approaches can indeed produce the same segmentation result. Running on a Pentium-III PC, both of our algorithms require less than 1/30 s for a 256 × 256 image and 1/5 s for a 512 × 512 image, on average. More surprisingly, the toboggan algorithm, which is less well known in the computer vision community, turns out to run faster than the immersion algorithm for almost all the test images we have used, especially for large images, say 512 × 512 or larger. This paper also gives some explanation as to why the toboggan algorithm can be more efficient in most cases. (A minimal toboggan sketch follows this entry.)
- Published
- 2006
- Full Text
- View/download PDF
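A minimal toboggan-style labeling on a gradient image, sketched from the description above: every pixel slides to its lowest 4-neighbor until it reaches a local minimum, and pixels sharing a minimum share a catchment-basin label. The paper's RIDGE label and order-invariance machinery are omitted, and plateaus are handled naively.

```python
# Toboggan simulation: follow steepest descent on the gradient image to a
# local minimum; every pixel on the way inherits that minimum's basin label.
import numpy as np

def toboggan_labels(grad: np.ndarray) -> np.ndarray:
    H, W = grad.shape
    labels = -np.ones((H, W), dtype=int)
    basins: dict[tuple[int, int], int] = {}

    def slide(y, x):
        path = []
        while labels[y, x] < 0:
            path.append((y, x))
            nbrs = [(y + dy, x + dx) for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1))
                    if 0 <= y + dy < H and 0 <= x + dx < W]
            ny, nx = min(nbrs, key=lambda p: grad[p])
            if grad[ny, nx] >= grad[y, x]:             # local minimum reached
                lab = basins.setdefault((y, x), len(basins))
                break
            y, x = ny, nx
        else:
            lab = labels[y, x]                         # joined an already-labeled path
        for p in path:
            labels[p] = lab
        return lab

    for y in range(H):
        for x in range(W):
            slide(y, x)
    return labels
```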
34. Driver Assistance System Providing an Intuitive Perspective View of Vehicle Surrounding
- Author
-
Yi-Ping Hung, Chun Kang Peng, Yen Ting Yeh, Kuan-Wen Chen, and Yong-Sheng Chen
- Subjects
Brightness, Paraboloid, Computer science, Image processing and computer vision, Advanced driver assistance systems, Rendering (computer graphics), Fisheye lens, Computer vision, Artificial intelligence, Sensory cue, Simulation, Camera resectioning, Ground plane - Abstract
Driver assistance systems can help drivers avoid car accidents by providing warning signals or visual cues about the surroundings. Instead of the fixed bird's-eye view monitoring proposed in many previous works, we developed a real-time vehicle-surround monitoring system that helps drivers perceive the vehicle's surroundings from third-person viewpoints. Four fisheye cameras were mounted around the vehicle in our system. We developed a simple and accurate fisheye camera calibration method to dewarp the captured images into perspective-projection ones. Next, we estimated the intrinsic parameters of each undistorted virtual camera using planar calibration patterns and then obtained the extrinsic camera parameters using global patterns on the ground plane. A new method was proposed to tackle the brightness uniformity problem caused by the cameras' varying lighting conditions. Finally, we projected the undistorted images onto a 3D hybrid projection model, stitched these images together, and rendered them from a third-person viewpoint selected by the driver. The proposed hybrid projection model is composed of a paraboloid model and a columnar model and achieves rendering results with less distortion. Compared with conventional around-vehicle monitoring systems, our system provides adaptive, integrated, and intuitive views of the vehicle's surroundings in a more realistic way. (A dewarping sketch follows this entry.)
- Published
- 2015
- Full Text
- View/download PDF
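A hedged sketch of the dewarping step using OpenCV's generic fisheye model as a stand-in for the paper's own calibration method; the intrinsics, distortion coefficients, and file name below are assumed placeholders, not values from the paper.

```python
# Dewarp one fisheye frame into a perspective-projection image with OpenCV.
# K and D would come from calibrating each of the four cameras.
import cv2
import numpy as np

K = np.array([[300.0, 0, 640], [0, 300.0, 360], [0, 0, 1]])  # assumed intrinsics
D = np.array([0.1, -0.05, 0.01, 0.0])                        # assumed distortion

img = cv2.imread("fisheye_frame.png")                        # placeholder input
size = img.shape[1::-1]                                      # (width, height)
new_K = cv2.fisheye.estimateNewCameraMatrixForUndistortRectify(
    K, D, size, np.eye(3), balance=0.5)
map1, map2 = cv2.fisheye.initUndistortRectifyMap(
    K, D, np.eye(3), new_K, size, cv2.CV_16SC2)
perspective = cv2.remap(img, map1, map2, interpolation=cv2.INTER_LINEAR)
# The four dewarped views are then registered to the ground plane, projected
# onto the hybrid paraboloid/columnar model, and stitched.
```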
35. Estimation of 3-D Foot Parameters Using Hand-Held RGB-D Camera
- Author
-
Yang-Sheng Chen, Sheng-Wen Shih, Yi-Ping Hung, Peng-Yuan Kao, and Yu-Chun Chen
- Subjects
Set (abstract data type), Computer science, Coordinate system, Hand held, Foot width, Point cloud, RGB color model, Computer vision, Artificial intelligence, Pose, Foot (unit) - Abstract
Most people choose shoes mainly by foot size. However, a foot size reflects only the foot length, not the foot width. Some people therefore use both the width and the length of their feet to select shoes, but even these two parameters cannot fully characterize the 3-D shape of a foot and are certainly not enough for selecting a pair of comfortable shoes: in general, the ball girth is also required in addition to the width and length. In this paper, we propose a foot measurement system consisting of a low-cost Intel Creative Senz3D RGB-D camera, an A4-size reference pattern, and a desktop computer. The reference pattern provides video-rate camera pose estimation, so the acquired 3-D data can be converted into a common reference coordinate system to form a complete set of foot surface data. We also propose a markerless ball-girth estimation method that uses the lengths of two toe gaps to infer the joint locations of the big/little toes and the metatarsals. Results from real experiments show that the proposed method is accurate enough to provide the three major foot parameters for shoe selection.
- Published
- 2015
- Full Text
- View/download PDF
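The step that makes the hand-held scanning work is converting each depth frame into the reference-pattern coordinate system so that partial scans accumulate into one foot surface. A minimal sketch of that conversion, assuming the per-frame pose (R, t) has already been estimated from the A4 pattern; the function and variable names are illustrative:

```python
import numpy as np

def to_reference_frame(points_cam, R, t):
    """Map an N x 3 point cloud from camera coordinates into the
    reference-pattern coordinate system, given the camera-to-reference
    rotation R and translation t: X_ref = R @ X_cam + t."""
    return points_cam @ R.T + t

# Toy example: one depth frame of the foot with an (assumed) identity
# pose; real frames would each carry their own estimated (R, t).
cloud_view1 = np.random.rand(100, 3)
R1, t1 = np.eye(3), np.zeros(3)
merged = to_reference_frame(cloud_view1, R1, t1)
print(merged.shape)   # accumulated surface data in one common frame
```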
36. A New Approach to Automatic Reconstruction of a 3-D World Using Active Stereo Vision
- Author
- Sheng-Wen Shih, Yi-Ping Hung, Chung-Yi Lin, and Gregory Y. Tang
- Subjects
Bayes estimator ,Random field ,Markov chain ,business.industry ,Bayesian probability ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,Rendering (computer graphics) ,Octree ,Stereopsis ,Signal Processing ,Computer vision ,Computer Vision and Pattern Recognition ,Artificial intelligence ,Active vision ,business ,Software ,ComputingMethodologies_COMPUTERGRAPHICS ,Mathematics - Abstract
In this paper, we propose a new automatic approach to reconstructing 3-D environments using an active binocular head. To efficiently store and access the depth estimates, we propose the use of an inverse polar octree, which transforms both unbounded depth estimates and unbounded estimation errors into a bounded 3-D space with appropriate resolution (a sketch of such a radial mapping follows this record). The depth estimates are computed using the asymptotic Bayesian estimation method, and the estimated depth values are then smoothed using discontinuity-preserving Markov random fields. The path of the local motion required by the asymptotic Bayesian method is determined online, automatically, to reduce the ambiguity of stereo matching. Rules for checking the consistency between new and previous observations have been developed to properly update the inverse polar octree. Experimental results show that the proposed approach is very promising for the automatic generation of 3-D models, which can be used for rendering a 3-D scene in a virtual reality system.
- Published
- 2002
- Full Text
- View/download PDF
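The abstract does not give the exact mapping behind the inverse polar octree, so the sketch below only illustrates the general idea: keep the viewing direction and compress the unbounded radius into a bounded interval, so that an octree over the resulting cube allocates coarser cells to distant, more uncertain depths. The reciprocal form used here is an assumption.

```python
import numpy as np

def inverse_polar_coords(p, r_min=0.5):
    """Map a 3-D point (unbounded range) to bounded coordinates by
    keeping its viewing direction and replacing the radius r with
    r_min / r, which compresses [r_min, inf) into (0, 1].  This
    reciprocal mapping is illustrative, not the paper's exact form."""
    r = np.linalg.norm(p)
    direction = p / r
    return direction * (r_min / r)   # norm <= 1 whenever r >= r_min

near = inverse_polar_coords(np.array([1.0, 0.0, 0.5]))
far = inverse_polar_coords(np.array([100.0, 0.0, 50.0]))
print(near, far)  # distant points cluster near the origin -> coarse cells
```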
37. Augmenting panoramas with object movies by generating novel views with disparity-based view morphing
- Author
- Szu-Wei Lin, Yi-Ping Hung, Chu-Song Chen, and Yu-Pao Tsai
- Subjects
Panorama ,Computer science ,business.industry ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,Object (computer science) ,Image-based modeling and rendering ,Computer Graphics and Computer-Aided Design ,Image (mathematics) ,Morphing ,Computer graphics (images) ,Augmented reality ,Computer vision ,Artificial intelligence ,Set (psychology) ,business ,Software ,Interior design - Abstract
Our goal is to augment a panorama with object movies in a visually 3D-consistent way. Note that a panorama is recorded as one single 2D image, whereas an object movie (OM) is composed of a set of 2D images taken around a 3D object. The challenge is how to integrate these two sources of 2D images in a 3D-consistent way so that the user can easily manipulate object movies in a panorama. To solve this problem, we adopt a purely image-based approach that does not have to reconstruct the geometric models of the 3D objects to be inserted into the panorama. A critical issue of this method is how to generate the novel views required for showing an OM in different places of a panorama, and we have proposed a view morphing technique, called t-DBVM, to solve this problem (a generic disparity-based morph is sketched after this record). Our experiments have shown that this purely image-based approach can effectively generate visually convincing OM-augmented panoramas. The method has great potential for many applications that require the integration of panoramas and object movies, such as virtual malls, virtual museums, and interior design. Copyright © 2002 John Wiley & Sons, Ltd.
- Published
- 2002
- Full Text
- View/download PDF
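The abstract does not spell out t-DBVM itself, so the sketch below shows only the generic idea it builds on: synthesizing an in-between view from a rectified pair by shifting each pixel along its scanline in proportion to its disparity. Names, the constant toy disparity, and the absence of hole filling are all simplifications.

```python
import numpy as np

def morph_view(left, disparity, t):
    """Forward-warp a rectified left image toward the right view by a
    fraction t in [0, 1]: a pixel matching right-image position x - d
    moves to x - t * d in the intermediate view.  Holes left by the
    forward warp (zeros here) would need filling in a real system."""
    h, w = left.shape
    out = np.zeros_like(left)
    for y in range(h):
        for x in range(w):
            x_new = int(x - t * disparity[y, x] + 0.5)  # nearest pixel
            if 0 <= x_new < w:
                out[y, x_new] = left[y, x]
    return out

left = np.random.rand(4, 8)
disp = np.full((4, 8), 2.0)     # toy constant disparity of two pixels
mid = morph_view(left, disp, 0.5)
print(mid[0])                   # row shifted by one pixel; hole at the end
```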
38. Object Detection for Neighbor Map Construction in an IoV System
- Author
- Ming-Hsuan Yang, Yi-Ping Hung, Shen-Chi Chen, Chu-Song Chen, Kevin Lin, and Kuan-Wen Chen
- Subjects
Computer science ,business.industry ,Pedestrian detection ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,Grid ,Odometer ,Object detection ,Inertial measurement unit ,Global Positioning System ,Computer vision ,Artificial intelligence ,business ,Focus (optics) ,Intelligent transportation system - Abstract
Many applications of machine-to-machine (M2M) based intelligent transportation systems rely heavily on accurate estimation of the neighbor map, which records the locations of all nearby vehicles and pedestrians. Building the neighbor map usually involves integrating multiple sensors, such as GPS, odometers, inertial measurement units (IMU), laser scanners, cameras, and RGB-D cameras. In this paper, we build an M2M framework to estimate the neighbor map and focus on improving vehicle and pedestrian detection with the most popular sensor, the camera. We propose a novel grid-based object detection approach that handles cameras on both roadside units and vehicles. It adapts to the environment, achieves high accuracy, and can be used to improve the performance of neighbor map estimation.
- Published
- 2014
- Full Text
- View/download PDF
39. 3-D Gaze Tracking Using Pupil Contour Features
- Author
- Hsin-Ruey Tsai, Yi-Ping Hung, Chih-Chuan Lai, and Sheng-Wen Shih
- Subjects
business.industry ,Computer science ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,Tracking system ,Curvature ,Tracking (particle physics) ,Refraction ,Gaze ,Pupil ,Feature (computer vision) ,Eye tracking ,Computer vision ,Artificial intelligence ,business - Abstract
Glint features play important roles in gaze tracking systems. However, when the operation range of a gaze tracking system is enlarged, the performance of glint-feature-based (GFB) approaches degrades, mainly due to the curvature variation around the edge of the cornea. Although the pupil contour feature can provide complementary information to help estimate the eye gaze, existing methods do not properly handle cornea refraction, leading to inaccurate results. This paper describes a contour-feature-based (CFB) 3-D gaze tracking method that properly accounts for cornea refraction. Experiments show the effectiveness of the proposed approach for gaze tracking.
- Published
- 2014
- Full Text
- View/download PDF
40. Appearance-Based Gaze Tracking with Free Head Movement
- Author
- Yi-Ping Hung, Kuan-Wen Chen, Sheng-Wen Shih, Shen-Chi Chen, Yu-Ting Chen, and Chih-Chuan Lai
- Subjects
business.industry ,Head (linguistics) ,Orientation (computer vision) ,Computer science ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,Tracking system ,Computer vision ,Artificial intelligence ,Tracking (particle physics) ,business ,Set (psychology) ,Gaze - Abstract
In this work, we develop an appearance-based gaze tracking system that allows users to move their heads freely. The main difficulty of appearance-based gaze tracking is that the eye appearance is sensitive to head orientation. To overcome this difficulty, we propose a 3-D gaze tracking method combining head pose tracking with appearance-based gaze estimation. We use a random forest to model the neighbor structure of the joint head-pose and eye-appearance space and to efficiently select neighbors from the collected high-dimensional data set. L1-optimization is then used to seek the best regression solution from the selected neighboring samples (a simplified neighbor-selection-plus-L1-regression pipeline is sketched after this record). Experimental results show that the system provides robust binocular gaze tracking under fewer constraints while still achieving moderate gaze estimation accuracy.
- Published
- 2014
- Full Text
- View/download PDF
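A simplified version of the described pipeline, with plain k-nearest neighbors standing in for the paper's random-forest neighbor selection and scikit-learn's Lasso providing the L1-optimization step; all data, dimensions, and parameter values are synthetic assumptions.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 20))   # synthetic joint pose+appearance features
y = rng.normal(size=(500, 2))    # synthetic gaze labels (yaw, pitch)

# k-NN stands in for the paper's random-forest neighbor selection.
knn = NearestNeighbors(n_neighbors=30).fit(X)

def estimate_gaze(x_query):
    idx = knn.kneighbors(x_query[None], return_distance=False)[0]
    # L1-regularized reconstruction of the query from its neighbors;
    # the sparse weights then blend the neighbors' gaze labels.
    lasso = Lasso(alpha=0.01, fit_intercept=False, positive=True)
    lasso.fit(X[idx].T, x_query)
    w = lasso.coef_
    if w.sum() < 1e-8:             # degenerate fit: fall back to the mean
        return y[idx].mean(axis=0)
    return (w @ y[idx]) / w.sum()

print(estimate_gaze(rng.normal(size=20)))
```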
41. A spatiotemporal background extractor using a single-layer codebook model
- Author
- Wei-Jie Liao, Chih-Wei Lin, Chu-Song Chen, and Yi-Ping Hung
- Subjects
Background subtraction ,business.industry ,Computer science ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,Codebook ,Extractor ,Wallflower ,Component (UML) ,Computer vision ,Artificial intelligence ,Noise (video) ,business ,Completeness (statistics) ,Spatial analysis - Abstract
Background subtraction is a crucial component of visual surveillance and has been studied for years. However, an efficient algorithm that can tolerate environmental changes, such as dynamic backgrounds and sudden changes of illumination, is still in demand. In this paper, we design an innovative framework, the spatiotemporal background extractor (SBE), built from a single-layer codebook model (a toy per-pixel codebook is sketched after this record). Two main extractors, the background extractor (BE) and the background gradient extractor (BGE), are constructed to extract the foreground objects. The background extractor is built for each frame with spatial information propagated from neighboring locations, which is useful for handling dynamic backgrounds and sudden lighting changes. The background gradient extractor is also constructed and updated, and we design a propagation-forbidden policy for background updating so as to preserve the completeness of the foreground shape via the background gradient information. The proposed method can efficiently capture the foreground and eliminate background noise. Its performance is compared with MoG [3], Codebook [4], and ViBe [8] on the Wallflower [1] and Perception [2] datasets.
- Published
- 2014
- Full Text
- View/download PDF
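As a toy reduction of the single-layer codebook idea (omitting SBE's spatial propagation, the gradient extractor, and the propagation-forbidden policy), each pixel can keep a list of brightness ranges: values matching a stored range are background and widen it, while unmatched values are foreground. The tolerance and the immediate codeword creation are simplifications; a real implementation would hold candidate codewords in a cache before promoting them.

```python
class CodebookPixel:
    """Minimal single-layer codebook for one pixel: each codeword
    stores a brightness range [lo, hi]."""
    def __init__(self, tol=10.0):
        self.words = []    # list of [lo, hi] brightness ranges
        self.tol = tol

    def classify_and_update(self, v):
        for word in self.words:
            if word[0] - self.tol <= v <= word[1] + self.tol:
                word[0] = min(word[0], v)   # background: widen the range
                word[1] = max(word[1], v)
                return "background"
        self.words.append([v, v])           # unseen value -> new codeword
        return "foreground"

# The first frame acts as training here, so it reports foreground once.
px = CodebookPixel()
for v in [100, 102, 99, 180, 101]:
    print(v, px.classify_and_update(v))
```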
42. Left-Luggage Detection from Finite-State-Machine Analysis in Static-Camera Videos
- Author
- Yi-Ping Hung, Kevin Lin, Chu-Song Chen, Shen-Chi Chen, and Daw-Tung Lin
- Subjects
Background subtraction ,Finite-state machine ,Pixel ,Computer science ,Event (computing) ,business.industry ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,Object detection ,Object-class detection ,Computer Science::Computer Vision and Pattern Recognition ,Video tracking ,Computer vision ,Artificial intelligence ,business - Abstract
We present an abandoned-object detection system in this paper. A finite-state-machine model is introduced to extract stationary foregrounds in a scene for visual surveillance, where the state value of each pixel is inferred through the cooperation of short-term and long-term background models constructed in the proposed approach (the per-pixel state logic is sketched after this record). To identify a left-luggage event, we then verify whether a static foreground is an abandoned object by analyzing the owner's moving trajectory, back-tracked to the static foreground location. Experimental results reveal that the proposed approach handles the problem well on publicly available datasets.
- Published
- 2014
- Full Text
- View/download PDF
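A minimal sketch of the per-pixel state inference the abstract implies: a pixel flagged foreground by the long-term model but already absorbed as background by the short-term model is a stationary-object candidate, and it is promoted to static foreground after persisting. The state names and the hold threshold are illustrative assumptions, not the paper's exact machine.

```python
def update_state(state, counter, fg_long, fg_short, hold=50):
    """One FSM step for a single pixel.  fg_long / fg_short are the
    foreground flags from the long- and short-term background models.
    Returns the next (state, counter)."""
    if not fg_long and not fg_short:
        return "background", 0
    if fg_long and fg_short:
        return "moving_foreground", 0
    if fg_long and not fg_short:              # stationary-object candidate
        counter += 1
        if counter >= hold:
            return "static_foreground", counter
        return "candidate_static", counter
    return state, counter                     # rare: short-term-only response

# An object enters (both models fire), then stops moving: the short-term
# model absorbs it while the long-term model keeps flagging it.
state, n = "background", 0
for frame in range(60):
    state, n = update_state(state, n, fg_long=True, fg_short=frame < 5)
print(state)   # -> static_foreground once the hold threshold is reached
```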
43. A sleep monitoring system based on audio, video and depth information for detecting sleep events
- Author
- Yi-Ping Hung, Kuan-Wen Chen, and Lyn Chao-ling Chen
- Subjects
Audio signal ,medicine.diagnostic_test ,business.industry ,Computer science ,Reliability (computer networking) ,Sleep apnea ,Electroencephalography ,medicine.disease ,computer.software_genre ,medicine ,Computer vision ,Artificial intelligence ,Noise (video) ,Sleep (system call) ,business ,Audio signal processing ,computer - Abstract
The purpose of this study is to develop a non-invasive sleep monitoring system that distinguishes sleep disturbances based on multiple sensors. Unlike clinical sleep monitoring, which records biological signals such as EEG, EOG, and EMG, this study aims to identify occurrences of events in the sleep environment. A device with an infrared depth sensor, an RGB camera, and a four-microphone array is used to detect three types of events: motion events, lighting events, and sound events. Given streams of depth signals and color images, we build two background models to detect movements and lighting effects, while audio signals are scored simultaneously. Moreover, we classify events with an epoch-based algorithm (sketched after this record) and provide a graphical sleep diagram for browsing the corresponding video clips. Experimental results under real sleep conditions show the efficiency and reliability of our system, which is convenient and cost-effective for use in a home context.
- Published
- 2014
- Full Text
- View/download PDF
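The abstract only names an "epoch approach"; a plausible minimal reading is to split the per-frame motion, lighting, and sound scores into fixed-length epochs and flag which event types occurred in each. The epoch length and thresholds below are assumptions for illustration.

```python
import numpy as np

def classify_epochs(motion, light, sound, epoch=30, thr=(0.5, 0.5, 0.5)):
    """Split per-frame scores into fixed-length epochs and flag, per
    epoch, which event types occurred.  Epoch length and thresholds
    are illustrative; the paper's exact rules are not in the abstract."""
    events = []
    for s in range(0, len(motion), epoch):
        seg = slice(s, s + epoch)
        events.append({
            "motion":   np.mean(motion[seg]) > thr[0],
            "lighting": np.mean(light[seg]) > thr[1],
            "sound":    np.mean(sound[seg]) > thr[2],
        })
    return events

rng = np.random.default_rng(1)
print(classify_epochs(rng.random(90), rng.random(90), rng.random(90)))
```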
44. Simple and efficient method of calibrating a motorized zoom lens
- Author
- Sheng-Wen Shih, Yong-Sheng Chen, Chiou-Shann Fuh, and Yi-Ping Hung
- Subjects
Zoom lens ,Digital zoom ,Computer science ,Aperture ,business.industry ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,Astrophysics::Instrumentation and Methods for Astrophysics ,Physics::Optics ,GeneralLiterature_MISCELLANEOUS ,law.invention ,Lens (optics) ,View camera ,law ,Camera auto-calibration ,Computer Science::Computer Vision and Pattern Recognition ,Computer graphics (images) ,Signal Processing ,Computer vision ,Computer Vision and Pattern Recognition ,Artificial intelligence ,Zoom ,business ,Camera resectioning - Abstract
In this work, three servo motors are used to independently control the aperture, zoom, and focus of our zoom lens. Our goal is to calibrate, efficiently, the camera parameters for all possible configurations of lens settings. We use a calibration object suitable for zoom lens calibration to deal with the defocusing problem. Instead of calibrating the zoom lens with respect to the three lens settings simultaneously, we perform monofocal camera calibration adaptively over the ranges of the zoom and focus settings while fixing the aperture setting at a preset value. Bilinear interpolation provides the values of the camera parameters for those lens settings where no observations are taken (a sketch follows this record). The adaptive strategy requires monofocal camera calibration only for the lens settings where the interpolated camera parameters are not accurate enough, and is hence referred to as the calibration-on-demand method. Our experiments show that the proposed calibration-on-demand method can provide accurate camera parameters for all the lens settings of a motorized zoom lens, even though camera calibration is performed for only a few sampled lens settings.
- Published
- 2001
- Full Text
- View/download PDF
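A minimal sketch of the interpolation step: a camera parameter (here a focal length, with made-up toy values) is calibrated only at a few sampled (zoom, focus) settings and bilinearly interpolated elsewhere. In the calibration-on-demand scheme, a new monofocal calibration is run only where the interpolated value fails an accuracy check.

```python
import numpy as np

def bilerp(grid, z, f):
    """Bilinear interpolation of a camera parameter stored on a regular
    (zoom, focus) sample grid, at fractional grid coordinates (z, f)."""
    z0, f0 = int(z), int(f)
    z1 = min(z0 + 1, grid.shape[0] - 1)
    f1 = min(f0 + 1, grid.shape[1] - 1)
    wz, wf = z - z0, f - f0
    return ((1 - wz) * ((1 - wf) * grid[z0, f0] + wf * grid[z0, f1]) +
            wz * ((1 - wf) * grid[z1, f0] + wf * grid[z1, f1]))

# Focal length calibrated at four sampled settings (toy values);
# intermediate settings are interpolated rather than recalibrated.
focal_grid = np.array([[800.0, 820.0],
                       [950.0, 985.0]])
print(bilerp(focal_grid, 0.5, 0.5))   # parameter at an unsampled setting
```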
45. Three-dimensional ego-motion estimation from motion fields observed with multiple cameras
- Author
- Chiou-Shann Fuh, Lin Gwo Liou, Yi-Ping Hung, and Yong-Sheng Chen
- Subjects
business.industry ,media_common.quotation_subject ,Optical flow ,Ego motion estimation ,Field of view ,Observer (special relativity) ,Ambiguity ,Residual ,Motion field ,Artificial Intelligence ,Motion estimation ,Signal Processing ,Computer vision ,Computer Vision and Pattern Recognition ,Artificial intelligence ,business ,Software ,Mathematics ,media_common - Abstract
In this paper, we present a robust method to estimate the three-dimensional ego-motion of an observer moving in a static environment. The method combines the optical flow fields observed with multiple cameras to avoid the ambiguity of 3-D motion recovery caused by a small field of view and small depth variation within it. Two residual functions are proposed to estimate the ego-motion in different situations. In the non-degenerate case, both the direction and the scale of the three-dimensional rotation and translation can be obtained. In the degenerate case, the rotation can still be obtained, but the translation can be obtained only up to a scale factor. Both the number of cameras and their placement affect the accuracy of the estimated ego-motion; we compare different camera configurations through simulation. Results of real-world experiments are also given to demonstrate the benefits of our method.
- Published
- 2001
- Full Text
- View/download PDF
46. AUTOMATIC DETECTION AND TRACKING OF HUMAN HEADS USING AN ACTIVE STEREO VISION SYSTEM
- Author
- Yi-Ping Hung, Cheng-Yuan Tang, and Zen Chen
- Subjects
Stereo cameras ,Human head ,business.industry ,Computer science ,Machine vision ,Template matching ,Detector ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,Stereopsis ,Artificial Intelligence ,Computer vision ,Computer Vision and Pattern Recognition ,Artificial intelligence ,business ,Active vision ,Face detection ,Software - Abstract
A new algorithm for automatically detecting and tracking human heads in complex backgrounds is proposed. By using an elliptical model for the human head, our maximum likelihood (ML) head detector can reliably locate human heads in images with complex backgrounds and is relatively insensitive to illumination and to rotation of the heads. The detector consists of two channels, horizontal and vertical, each implemented by multiscale template matching (a toy multiscale matcher is sketched after this record). Using a hierarchical structure in implementing the detector, the execution time for detecting the human heads in a 512×512 image is about 0.02 seconds on a Sparc 20 workstation (not including the time for image acquisition). Based on the ellipse-based ML head detector, we have developed a head tracking method that can monitor the entrance of a person, detect and track the person's head, and then control the stereo cameras to focus their gaze on this person's head. In this method, the ML head detector and the mutually-supported constraint are used to extract the corresponding ellipses in a stereo image pair. To implement a practical and reliable face detection and tracking system, further verification using facial features, such as the eyes, mouth, and nostrils, may be essential. The 3D position computed from the centers of the two corresponding ellipses is then used for fixation. An active stereo head was used in the experiments, demonstrating that the proposed approach is feasible and promising for practical use.
- Published
- 2000
- Full Text
- View/download PDF
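A toy illustration of multiscale template matching: the image is matched at several decimation factors and the best-scoring location and scale are returned. The paper's detector correlates separate horizontal and vertical edge channels against an elliptical head model; a single intensity template stands in for those channels here, and all names are illustrative.

```python
import numpy as np

def match_score(window, template):
    """Normalized correlation between an image window and a template."""
    w = window - window.mean()
    t = template - template.mean()
    denom = np.linalg.norm(w) * np.linalg.norm(t) + 1e-8
    return float((w * t).sum() / denom)

def multiscale_match(image, template, factors=(1, 2)):
    """Slide the template over the image decimated by each factor k
    (image[::k, ::k]); return the best (score, y, x, k) with the
    location mapped back to full-resolution coordinates."""
    best = (-1.0, 0, 0, 1)
    th, tw = template.shape
    for k in factors:
        im = image[::k, ::k]
        for y in range(im.shape[0] - th + 1):
            for x in range(im.shape[1] - tw + 1):
                s = match_score(im[y:y+th, x:x+tw], template)
                if s > best[0]:
                    best = (s, y * k, x * k, k)
    return best

img = np.random.rand(32, 32)
tpl = img[10:18, 12:20].copy()     # plant a known "head" template
print(multiscale_match(img, tpl))  # recovers it at (10, 12), scale 1
```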
47. Theoretical aspects of vertically invariant gray-level morphological operators and their application on adaptive signal and image filtering
- Author
- Yi-Ping Hung, Ja-Ling Wu, and Chu-Song Chen
- Subjects
Signal processing ,business.industry ,Feature extraction ,Image processing ,Mathematical morphology ,Structuring ,Adaptive filter ,Signal Processing ,Computer vision ,Artificial intelligence ,Electrical and Electronic Engineering ,Invariant (mathematics) ,business ,Algorithm ,Smoothing ,Mathematics - Abstract
We use vertically invariant morphological filters for time-varying or adaptive signal processing. The morphological filters adopted in this paper are vertically invariant openings and closings, which have intuitive geometric interpretations and can provide different filtering scales at different spatial positions; hence, they are suitable for adaptive signal filtering (a 1-D example follows this record). To adaptively assign the structuring elements of the vertically invariant openings or closings, we develop the progressive umbra-filling (PUF) procedure. Experimental results show that our approach can eliminate noise without oversmoothing the important features of a signal.
- Published
- 1999
- Full Text
- View/download PDF
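To make the idea of position-dependent filtering scales concrete, here is a 1-D gray-level opening with a flat structuring element whose half-width can differ per sample. The paper's vertically invariant operators and the PUF procedure for choosing element sizes are more general; the fixed per-sample widths below are an assumption for illustration.

```python
import numpy as np

def adaptive_opening(signal, widths):
    """1-D gray-level opening (erosion then dilation) with a flat
    structuring element of per-position half-width widths[i], so
    different spatial locations are filtered at different scales."""
    n = len(signal)
    eroded = np.empty(n)
    for i in range(n):
        lo, hi = max(0, i - widths[i]), min(n, i + widths[i] + 1)
        eroded[i] = signal[lo:hi].min()
    opened = np.empty(n)
    for i in range(n):
        lo, hi = max(0, i - widths[i]), min(n, i + widths[i] + 1)
        opened[i] = eroded[lo:hi].max()
    return opened

sig = np.array([0, 0, 7, 0, 0, 0, 9, 9, 0, 0], dtype=float)
wid = np.array([1, 1, 1, 1, 1, 0, 0, 0, 0, 0])
# Half-width 1 on the left removes the one-sample spike; half-width 0
# on the right leaves the narrow plateau untouched.
print(adaptive_opening(sig, wid))   # -> [0 0 0 0 0 0 9 9 0 0]
```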
48. Generation of multiviewpoint video from stereoscopic video
- Author
- Ching-Che Kao, Ho-Chao Huang, and Yi-Ping Hung
- Subjects
Motion compensation ,Video post-processing ,Stereo cameras ,Video capture ,Computer science ,business.industry ,Epipolar geometry ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,Image processing ,Video processing ,Video compression picture types ,Uncompressed video ,Computer graphics (images) ,Video tracking ,Media Technology ,Computer vision ,Video denoising ,Artificial intelligence ,Electrical and Electronic Engineering ,Multiview Video Coding ,business ,Block-matching algorithm ,Interpolation - Abstract
This paper proposes a new technique for generating multiviewpoint video from a two-view stereo video sequence. The two-view stereo video can be obtained easily with inexpensive two-view stereo video capture devices. For each image pair in the two-view stereoscopic video, our system first estimates the correspondence of each pixel under the epipolar constraint (a toy scanline matcher is sketched after this record). A smoothing algorithm refines the estimates, producing a disparity map for each image pair. The system then generates multiple-perspective stereo video by interpolating or extrapolating the original views based on the disparity maps. Compared with the traditional method of capturing multiple-perspective video directly, our method significantly reduces both the difficulty of video capture and processing and the amount of video data.
- Published
- 1999
- Full Text
- View/download PDF
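For a rectified pair the epipolar constraint makes the correspondence search one-dimensional along each scanline. The sketch below is a toy SAD block matcher standing in for the paper's estimator; its smoothing step that regularizes the disparity maps is omitted, and the block size and search range are assumed values.

```python
import numpy as np

def scanline_disparity(left, right, block=3, max_disp=8):
    """Estimate disparity for a rectified pair by SAD block matching
    along each scanline: the left pixel at x is compared against
    right-image candidates at x - d for d = 0..max_disp."""
    h, w = left.shape
    r = block // 2
    disp = np.zeros((h, w), dtype=int)
    for y in range(r, h - r):
        for x in range(r, w - r):
            patch = left[y-r:y+r+1, x-r:x+r+1]
            best, best_d = np.inf, 0
            for d in range(0, min(max_disp, x - r) + 1):
                cand = right[y-r:y+r+1, x-d-r:x-d+r+1]
                sad = np.abs(patch - cand).sum()
                if sad < best:
                    best, best_d = sad, d
            disp[y, x] = best_d
    return disp

right = np.random.rand(10, 16)
left = np.roll(right, 3, axis=1)          # synthetic 3-pixel shift
print(scanline_disparity(left, right)[5]) # interior estimates recover 3
```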
49. Calibration of an active binocular head
- Author
- Wei-Song Lin, Sheng-Wen Shih, and Yi-Ping Hung
- Subjects
Pixel ,business.industry ,Epipolar geometry ,Computer Science Applications ,Human-Computer Interaction ,Control and Systems Engineering ,Camera auto-calibration ,Video tracking ,Focal length ,Computer vision ,Artificial intelligence ,Electrical and Electronic Engineering ,business ,Active vision ,Focus (optics) ,Software ,Camera resectioning ,Mathematics - Abstract
In this paper, we show how an active binocular head, the IIS head, can be easily calibrated with very high accuracy. Our calibration method can also be applied to many other binocular heads. In addition to proposing and demonstrating a four-stage calibration process, this paper makes three major contributions. First, we propose a motorized-focus lens (MFL) camera model that assumes constant nominal extrinsic parameters; the advantage of constant extrinsic parameters is a simple head/eye relation. Second, we propose a calibration method for the MFL camera model that separates estimation of the image center and effective focal length from estimation of the camera orientation and position. This separation proves to be crucial; otherwise, the estimates of the camera parameters would be very sensitive to noise. Third, we show that, once the parameters of the MFL camera model are calibrated, a nonlinear recursive least-squares estimator can be used to refine all 35 kinematic parameters. Real experiments have shown that the proposed method can achieve an accuracy of one-pixel prediction error and 0.2-pixel epipolar error, even when all the joints, including the left and right focus motors, are moved simultaneously. This accuracy is good enough for many 3D vision applications, such as navigation, object tracking, and reconstruction.
- Published
- 1998
- Full Text
- View/download PDF
50. Panoramic Stereo Imaging System with Automatic Disparity Warping and Seaming
- Author
- Ho-Chao Huang and Yi-Ping Hung
- Subjects
Computer science ,business.industry ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,Tripod (photography) ,Robotics ,Image processing ,Virtual reality ,Computer Graphics and Computer-Aided Design ,Stereo imaging ,Modeling and Simulation ,Computer graphics (images) ,Systems architecture ,Computer vision ,Geometry and Topology ,Computer Vision and Pattern Recognition ,Artificial intelligence ,Image warping ,business ,Stereo camera - Abstract
Two commonly used approaches for building a virtual reality (VR) world are the model-based approach and the image-based approach. Recently, the image-based approach has received much attention because its VR models are easier to build and can provide photo-realistic views. However, traditional image-based VR systems cannot produce the stereo views that give users a sense of 3D depth. In this paper, we present a panoramic stereo imaging (PSI) system that produces stereo panoramas for image-based VR systems. The system, referred to as PSI-II, improves on our earlier experimental PSI-I system. PSI-I uses a well-calibrated tripod system to acquire a series of stereo image pairs, whereas PSI-II does not require a well-calibrated tripod and can automatically generate a stereo pair of panoramic images using a novel disparity warping technique and a hierarchical seaming algorithm. PSI-II automatically corrects the epipolar-line inconsistency of the stereo image pairs and the image disparity caused by the dislocation of the camera's lens center during image acquisition. Our experiments show that the proposed method can easily provide realistic 360° panoramic views for image-based VR systems.
- Published
- 1998
- Full Text
- View/download PDF