Vision-Language Pre-Training - Basics, Recent Advances, and Future Trends (Paperback)
Series: Foundations and Trends® in Computer Graphics and Vision
Humans perceive the world through many channels, such as images
viewed by the eyes, or voices heard by the ears. Though any
individual channel might be incomplete or noisy, humans can
naturally align and fuse information collected from multiple
channels in order to grasp the key concepts needed for a better
understanding of the world. One of the core aspirations in
Artificial Intelligence (AI) is to develop algorithms that endow
computers with the ability to learn effectively from multimodal (or
multi-channel) data. Such data is analogous to the sights and sounds,
gathered through vision and language, that help humans make sense of
the world around them. For example, computers could mimic this
ability by retrieving the images most relevant to a text query (or
vice versa), and by describing the content of an image using
natural language. Vision-and-Language (VL), a popular research area
that sits at the nexus of Computer Vision and Natural Language
Processing (NLP), aims to achieve this goal. This monograph surveys
vision-language pre-training (VLP) methods for multimodal
intelligence that have been developed in the last few years.
Approaches are grouped into three categories: (i) VLP for
image-text tasks, such as image captioning, image-text retrieval,
visual question answering, and visual grounding; (ii) VLP for core
computer vision tasks, such as (open-set) image classification,
object detection, and segmentation; and (iii) VLP for video-text
tasks, such as video captioning, video-text retrieval, and video
question answering. For each category, a comprehensive review of
state-of-the-art methods is presented, and the progress that has
been made and challenges still being faced are discussed, using
specific systems and models as case studies. In addition, for each
category, advanced topics being actively explored in the research
community are presented, such as big foundation models, unified
modeling, in-context few-shot learning, knowledge, robustness, and
computer vision in the wild, to name a few.