U.S. patent application number 13/417375 was filed with the patent office on 2012-03-12 for electronic device and method for controlling electronic device, and was published on 2013-04-04.
This patent application is currently assigned to LG ELECTRONICS INC. The applicants listed for this patent are Sungmin BAEK, Joomin KIM, Kyunghwan KIM, and Hanyoung KO. Invention is credited to Sungmin BAEK, Joomin KIM, Kyunghwan KIM, and Hanyoung KO.
United States Patent Application 20130083151
Kind Code: A1
KIM; Kyunghwan; et al.
April 4, 2013
ELECTRONIC DEVICE AND METHOD FOR CONTROLLING ELECTRONIC DEVICE
Abstract
An electronic device is provided that includes a memory, a
communication unit configured to receive a video image streamed
from a first electronic device, a display unit configured to
display the video image, and a control unit configured to identify
a specific area included in the video image, to store a first image
displayed on the specific area at a first time point in the memory,
and to store in the memory a second image displayed on the specific
area at a second time point when a variation of an image is equal
to or more than a predetermined threshold.
Inventors: KIM; Kyunghwan (Seoul, KR); KO; Hanyoung (Seoul, KR); KIM; Joomin (Seoul, KR); BAEK; Sungmin (Seoul, KR)

Applicant:

  Name            City   State   Country   Type
  KIM; Kyunghwan  Seoul          KR
  KO; Hanyoung    Seoul          KR
  KIM; Joomin     Seoul          KR
  BAEK; Sungmin   Seoul          KR

Assignee: LG ELECTRONICS INC. (Seoul, KR)
Family ID: 47992203
Appl. No.: 13/417375
Filed: March 12, 2012
Related U.S. Patent Documents

  Application Number   Filing Date    Patent Number
  61541289             Sep 30, 2011
Current U.S. Class: 348/14.07; 348/E7.083
Current CPC Class: H04N 7/147 (20130101); H04L 65/1069 (20130101); H04L 65/4053 (20130101); H04N 7/15 (20130101)
Class at Publication: 348/14.07; 348/E07.083
International Class: H04N 7/15 (20060101) H04N007/15
Claims
1. An electronic device comprising: a memory; a communication unit
configured to receive a video image streamed from a first
electronic device; a display unit configured to display the video
image; and a control unit configured to identify a specific area
included in the video image, to store a first image displayed on
the specific area at a first time point in the memory, and to store
in the memory a second image displayed on the specific area at a
second time point when a variation of an image displayed in the
specific area is equal to or more than a predetermined
threshold.
2. The electronic device of claim 1, wherein the control unit is
configured to store the first image so that the first image
corresponds to the first time point and to store the second image
so that the second image corresponds to the second time point.
3. The electronic device of claim 1, wherein the first and second
images are still images.
4. The electronic device of claim 1, wherein the control unit is
configured to determine the variation of the image displayed on the
specific area based on a variation between the image displayed on
the specific area and the first image stored in the memory.
5. The electronic device of claim 1, wherein the control unit is
configured to analyze the video image to identify the specific
area.
6. The electronic device of claim 1, wherein the control unit is
configured to receive information on the specific area from the
first electronic device through the communication unit and to
identify the specific area based on the received information.
7. An electronic device comprising: a memory; a communication unit
configured to receive a video image streamed from a first
electronic device; a display unit configured to display the video
image; and a control unit configured to identify a specific area
included in the video image, to store in the memory a still image
reflecting a content displayed on the specific area whenever the
content changes, so that the still image corresponds to a time
point when the content changes, to determine a time point
corresponding to a predetermined request when the request is
received, and to call a still image corresponding to the determined
time point from the memory.
8. The electronic device of claim 7, wherein the control unit is
configured to display both the streamed video image and the called
still image on the display unit.
9. The electronic device of claim 8, wherein the control unit is
configured to display the still image on a second area of the video
image, the second area not overlapping the specific area.
10. The electronic device of claim 7, wherein the control unit is
configured to replace an image displayed on the specific area of
the streamed video image by the still image and to display the
replaced still image on the display unit.
11. An electronic device comprising: a memory; a communication unit
configured to receive at least one multimedia data clip from at
least one second electronic device; a display unit configured to
display the at least one multimedia data clip; and a control unit
configured to identify a first speaker corresponding to audio data
included in the at least one multimedia data clip, to obtain
information corresponding to the identified first speaker, and to
store the obtained information so that the obtained information
corresponds to a first time point for the at least one multimedia
data clip.
12. The electronic device of claim 11, wherein the first time point
is when the first speaker begins to speak.
13. The electronic device of claim 12, wherein the control unit is
configured to analyze audio data included in the at least one
multimedia data clip and to determine that the first time point is
when a human voice included in the audio data is sensed.
14. The electronic device of claim 11, wherein the control unit is
configured to analyze video data included in the at least one
multimedia data clip to identify the first speaker.
15. The electronic device of claim 14, wherein the control unit is
configured to identify the first speaker based on a lip motion
included in the video data.
16. The electronic device of claim 11, wherein information relating
to the first speaker includes at least one of personal information
on the first speaker, information on a place where the first
speaker is positioned, and a keyword which the first speaker
speaks.
17. An electronic device comprising: a communication unit
configured to receive at least one multimedia data clip streamed
from at least one second electronic device; a memory configured to
store the at least one multimedia data clip; a display unit
configured to display video data included in the at least one
multimedia data clip; and a control unit configured to, whenever a
speaker corresponding to audio data included in the at least one
multimedia data clip changes, store information corresponding to
the speaker so that the information corresponds to a time point
when the speaker changes, to determine a time point corresponding
to a predetermined input when the predetermined input is received,
and to call at least part of a multimedia data clip corresponding
to the determined time point from the memory.
18. The electronic device of claim 17, wherein the control unit is
configured to display both the video data included in the streamed
at least one multimedia data clip and video data included in the
called at least part of the multimedia data clip.
19. The electronic device of claim 17, further comprising a sound
output unit, wherein the control unit is configured to output
through the sound output unit at least one of audio data included
in the streamed at least one multimedia data clip and audio data
included in the called at least part of the multimedia data
clip.
20. The electronic device of claim 17, wherein the control unit is
configured to display both the video data included in the streamed
at least one multimedia data clip and text data corresponding to
audio data included in the called at least part of the multimedia
data clip.
Description
CROSS-REFERENCE TO RELATED APPLICATION
[0001] This application claims priority to Korean Patent
Application No. ______ filed on ______, the contents of which are
herein incorporated by reference in their entirety.
BACKGROUND
[0002] 1. Technical Field
[0003] Embodiments of the present invention are directed to an
electronic device and a method of operating the electronic device,
and more specifically to an electronic device that may be used for
a videoconference and a method of controlling the electronic
device.
[0004] 2. Discussion of the Related Art
[0005] Tele-presence refers to a set of technologies which allow a
person to feel as if they were present at a remote location.
Tele-presence technologies reproduce, at a remote location,
information on the five senses that a person would perceive in a
specific space. Element technologies for tele-presence may include
video, audio, tactile, and network transmission technologies. Such
tele-presence technologies are adopted for video conference systems.
Tele-presence-based video conference systems provide higher-quality
communications and allow users to concentrate more fully on the
conversation compared to conventional video conference systems.
[0006] The tele-presence technologies for video conference systems,
although differing slightly from manufacturer to manufacturer, may be
applied to video, audio, and network transmission technologies as
follows:
[0007] For video, tele-presence technologies generate natural
eye-contact images, which make a user feel as if actually facing
another user, as well as high-resolution images. For audio, they
provide audio playback technologies that create a sense of space
based on a speaker's location. For network transmission, they provide
real-time image/sound transmission technologies based on an MCU
(Multipoint Control Unit).
[0008] In contrast to video, audio, and network transmission for
video conference systems, which have been actively researched, data
sharing between attendants in a conference is still not satisfactory.
Current video conference systems use a separate monitor for data
sharing. Accordingly, when a user shifts his eyes from an image
screen to a data screen, eye contact is not maintained, which weakens
the feeling of actually facing another user. Moreover, a short drop
in conversation occurs at every data manipulation because the data
manipulation is conducted through a peripheral device, such as a
mouse.
SUMMARY
[0009] Embodiments of the present invention provide an electronic
device and a method of operating the electronic device, which may
allow for a vivid video conference.
[0010] According to an embodiment of the present invention, there
is provided an electronic device including a memory, a
communication unit configured to receive a video image streamed
from a first electronic device, a display unit configured to
display the video image, and a control unit configured to identify
a specific area included in the video image, to store a first image
displayed on the specific area at a first time point in the memory,
and to store in the memory a second image displayed on the specific
area at a second time point when a variation of an image is equal
to or more than a predetermined threshold.
[0011] The control unit is configured to store the first image so
that the first image corresponds to the first time point and to
store the second image so that the second image corresponds to the
second time point.
[0012] The first and second images are still images.
[0013] The control unit is configured to determine the variation of
the image displayed on the specific area based on a variation
between the image displayed on the specific area and the first
image stored in the memory.
[0014] The control unit is configured to analyze the video image to
identify the specific area.
[0015] The control unit is configured to receive information on the
specific area from the first electronic device through the
communication unit and to identify the specific area based on the
received information.
[0016] According to an embodiment of the present invention, there
is provided an electronic device including a memory, a
communication unit configured to receive a video image streamed
from a first electronic device, a display unit configured to
display the video image, and a control unit configured to identify
a specific area included in the video image, to store in the memory
a still image reflecting a content displayed on the specific area
whenever the content changes, so that the still image corresponds
to a time point when the content changes, to determine a time point
corresponding to a predetermined request when the request is
received, and to call a still image corresponding to the determined
time point from the memory.
[0017] The control unit is configured to display both the streamed
video image and the called still image on the display unit.
[0018] The control unit is configured to display the still image on
a second area of the video image, the second area not overlapping
the specific area.
[0019] The control unit is configured to replace an image displayed
on the specific area of the streamed video image by the still image
and to display the replaced still image on the display unit.
[0020] According to an embodiment of the present invention, there
is provided an electronic device including a memory, a
communication unit configured to receive at least one multimedia
data clip from at least one second electronic device, a display
unit configured to display the at least one multimedia data clip,
and a control unit configured to identify a first speaker
corresponding to audio data included in the at least one multimedia
data clip, to obtain information corresponding to the identified
first speaker, and to store the obtained information so that the
obtained information corresponds to a first time point for the at
least one multimedia data clip.
[0021] The first time point is when the first speaker begins to
speak.
[0022] The control unit is configured to analyze audio data
included in the at least one multimedia data clip and to determine
that the first time point is when a human voice included in the
audio data is sensed.
[0023] The control unit is configured to analyze video data
included in the at least one multimedia data clip to identify the
first speaker.
[0024] The control unit is configured to identify the first speaker
based on a lip motion included in the video data.
[0025] Information relating to the first speaker includes at least
one of personal information on the first speaker, information on a
place where the first speaker is positioned, and a keyword which
the first speaker speaks.
[0026] According to an embodiment of the present invention, there
is provided an electronic device including a communication unit
configured to receive at least one multimedia data clip streamed
from at least one second electronic device, a memory configured to
store the at least one multimedia data clip, a display unit
configured to display video data included in the at least one
multimedia data clip, and a control unit configured to, whenever a
speaker corresponding to audio data included in the at least one
multimedia data clip changes, store information corresponding to
the speaker so that the information corresponds to a time point
when the speaker changes, to determine a time point corresponding
to a predetermined input when the predetermined input is received,
and to call at least part of a multimedia data clip corresponding
to the determined time point from the memory.
[0027] The control unit is configured to display both the video
data included in the streamed at least one multimedia data clip and
video data included in the called at least part of the multimedia
data clip.
[0028] The electronic device further includes a sound output unit,
wherein the control unit is configured to output through the sound
output unit at least one of audio data included in the streamed at
least one multimedia data clip and audio data included in the
called at least part of the multimedia data clip.
[0029] The control unit is configured to display both the video
data included in the streamed at least one multimedia data clip and
text data corresponding to audio data included in the called at
least part of the multimedia data clip.
[0030] The embodiments of the present invention may provide the
following effects.
[0031] First, the second user, who attends the video conference at
the second place, may store images of the presentation material
without separately receiving data for the presentation material from
the first user, who conducts the video conference at the first place.
The images of the presentation material may be extracted from the
video image provided through the video conference and stored by the
electronic device used by the second user, without any cumbersome
process such as receiving in advance separate electronic data (for
example, electronic files) for the presentation material used for the
video conference by the first user.
[0032] Second, in the case that the conference is performed with
materials that are difficult to convert into data (for example,
samples used for introducing a prototype model), or the first user
has not converted the presentation material into electronic data in
advance, the presentation material may, according to an embodiment,
be converted into image data while being used for the conference, so
that the second user may later review the presentation material used
for the video conference.
[0033] Third, the second user may review the previous pages of the
presentation material used for the video conference hosted by the
first user while the video conference is in progress, thereby
enabling a more efficient video conference.
[0034] Fourth, the electronic device may continue to monitor the
speakers during the course of the video conference and may store
various types of information on the speakers so that the information
corresponds to the time points when the speakers begin to speak,
thereby generating metadata for the video conference. Further, the
video conference metadata may be used in various manners, thus
enhancing user convenience. For example, the video conference
metadata may be used to prepare brief minutes of the video
conference, which are to be provided to the attendees of the
conference, or may be used to provide a search function which allows
the attendees to review the conference.
[0035] Fifth, by identifying the speaker and outputting the
multimedia data clip corresponding to the speaker in a different
manner than those for the other multimedia data clips, more
attention can be directed toward the user who is speaking in
the video conference, thereby enabling the video conference to
proceed more efficiently.
[0036] Finally, during or after the video conference, specific
time points of the multimedia data for the video conference may be
searched to review the video conference, and the multimedia data
corresponding to the searched time points may be output.
BRIEF DESCRIPTION OF THE DRAWINGS
[0037] The embodiments of the present invention will become readily
apparent by reference to the following detailed description when
considered in conjunction with the accompanying drawings
wherein:
[0038] FIG. 1 is a block diagram illustrating an electronic device
according to an embodiment of the present invention;
[0039] FIG. 2 is a view illustrating an example where a user inputs
a gesture to an electronic device as shown in FIG. 1;
[0040] FIG. 3 is a view for describing an environment according to
an embodiment of the present invention;
[0041] FIG. 4 is a flowchart illustrating a method of controlling
an electronic device according to an embodiment of the present
invention;
[0042] FIG. 5 is a view illustrating a video image displayed by an
electronic device according to an embodiment of the present
invention;
[0043] FIG. 6 is a view illustrating a screen viewed after the page
of the presentation material displayed on the specific area has
changed according to an embodiment of the present invention;
[0044] FIG. 7 is a view illustrating an example of storing a
plurality of images displayed on a specific area according to an
embodiment of the present invention;
[0045] FIGS. 8 to 10 are views illustrating methods of obtaining a
predetermined request according to embodiments of the present
invention;
[0046] FIGS. 11 and 12 are views illustrating exemplary methods of
displaying an obtained image according to embodiments of the
present invention;
[0047] FIG. 13 is a view schematically illustrating an environment
to which an embodiment of the present invention applies;
[0048] FIG. 14 is a flowchart illustrating a method of controlling
an electronic device according to an embodiment of the present
invention;
[0049] FIGS. 15 to 19 are views illustrating embodiments of the
present invention;
[0050] FIG. 20 is a view illustrating an example of video
conference metadata stored according to an embodiment of the
present invention;
[0051] FIG. 21 is a flowchart illustrating a method of controlling
an electronic device according to an embodiment of the present
invention;
[0052] FIG. 22 illustrates examples of receiving the predetermined
input by various methods according to an embodiment of the present
invention; and
[0053] FIGS. 23 to 25 are views illustrating examples of outputting
the current and past multimedia data according to an embodiment of
the present invention.
DESCRIPTION OF THE EMBODIMENTS
[0054] The present invention will now be described more fully with
reference to the accompanying drawings, in which exemplary
embodiments of the invention are shown. The invention may, however,
be embodied in many different forms and should not be construed as
being limited to the embodiments set forth herein; rather, these
embodiments are provided so that this disclosure will be thorough
and complete, and will fully convey the concept of the invention to
those skilled in the art.
[0055] Hereinafter, a mobile terminal relating to the present
invention will be described below in more detail with reference to
the accompanying drawings. In the following description, the suffixes
"module" and "unit" are given to components of the mobile terminal
merely for ease of description and do not have meanings or functions
distinguished from each other.
[0056] FIG. 1 is a block diagram illustrating an electronic device
according to an embodiment of the present invention.
[0057] Referring to FIG. 1, the electronic device 100 includes a
communication unit 110, a user input unit 120, an output unit 150,
a memory 160, an interface unit 170, a control unit 180, and a
power supply unit 190. The components shown in FIG. 1 may be
components that may be commonly included in an electronic device.
Accordingly, more or fewer components may be included in the
electronic device 100.
[0058] The communication unit 110 may include one or more modules
that enable communication between the electronic device 100 and a
communication system or between the electronic device 100 and
another device. For instance, the communication unit 110 may
include a broadcast receiving unit 111, an Internet module 113, and
a near-field communication module 114.
[0059] The broadcast receiving unit 111 receives broadcast signals
and/or broadcast-related information from an external broadcast
managing server through a broadcast channel.
The broadcast channel may include a satellite channel and a
terrestrial channel. The broadcast managing server may refer to a
server that generates broadcast signals and/or broadcast-related
information and broadcasts the signals and/or information or a
server that receives pre-generated broadcast signals and/or
broadcast-related information and broadcasts the signals and/or
information to a terminal. The broadcast signals may include TV
broadcast signals, radio broadcast signals, data broadcast signals
as well as combinations of TV broadcast signals or radio broadcast
signals and data broadcast signals.
[0061] The broadcast-related information may refer to information
relating to broadcast channels, broadcast programs, or broadcast
service providers. The broadcast-related information may be
provided through a communication network.
[0062] The broadcast-related information may exist in various
forms, such as, for example, EPGs (Electronic Program Guides) of
DMB (Digital Multimedia Broadcasting) or ESGs (Electronic Service
Guides) of DVB-H (Digital Video Broadcast-Handheld).
[0063] The broadcast receiving unit 111 may receive broadcast
signals using various broadcast systems. Broadcast signals and/or
broadcast-related information received through the broadcast
receiving unit 111 may be stored in the memory 160.
[0064] The Internet module 113 may refer to a module for access to
the Internet. The Internet module 113 may be provided inside or
outside the electronic device 100.
[0065] The near-field communication module 114 refers to a module
for near-field communication. Near-field communication technologies
may include Bluetooth, RFID (Radio Frequency Identification), IrDA
(Infrared Data Association), UWB (Ultra Wideband), and ZigBee
technologies.
[0066] The user input unit 120 is provided for a user's entry of
audio or video signals and may include a camera 121 and a
microphone 122.
[0067] The camera 121 processes image frames including still images
or videos as obtained by an image sensor in a video call mode or
image capturing mode. The processed image frames may be displayed
by the display unit 151. The camera 121 may perform 2D or 3D image
capturing or may be configured as one or a combination of 2D and 3D
cameras.
[0068] The image frames processed by the camera 121 may be stored
in the memory 160 or may be transmitted to an outside device
through the communication unit 110. According to an embodiment, two
or more cameras 121 may be included in the electronic device
100.
[0069] The microphone 122 receives external sound signals in a call
mode, recording mode, or voice recognition mode and processes the
received signals as electrical voice data. The microphone 122 may
perform various noise cancelling algorithms to remove noises
created when receiving the external sound signals. A user may input
various voice commands through the microphone 122 to the electronic
device 100 to drive the electronic device 100 and to perform
functions of the electronic device 100.
[0070] The output unit 150 may include a display unit 151 and a
sound output unit 152.
[0071] The display unit 151 displays information processed by the
electronic device 100. For example, the display unit 151 displays a
UI (User Interface) or GUI (Graphic User Interface) associated with
the electronic device 100. The display unit 151 may be at least one
of a liquid crystal display, a thin film transistor liquid crystal
display, an organic light emitting diode display, a flexible
display, and a 3D display. The display unit 151 may be configured
in a transparent or light transmissive type, which may be called a
"transparent display" examples of which include transparent LCDs.
The display unit 151 may have a light-transmissive rear structure
in which a user may view an object positioned behind the terminal
body through an area occupied by the display unit 151 in the
terminal body.
[0072] According to an embodiment, two or more display units 151
may be included in the electronic device 100. For instance, the
electronic device 100 may include a plurality of display units 151
that are integrally or separately arranged on a surface of the
electronic device 100 or on respective different surfaces of the
electronic device 100.
[0073] When the display unit 151 and a sensor sensing a touch
(hereinafter, referred to as a "touch sensor") are layered (this
layered structure is hereinafter referred to as a "touch screen"),
the display unit 151 may be used as an input device as well as an
output device. The touch sensor may include, for example, a touch
film, a touch sheet, or a touch pad.
[0074] The touch sensor may be configured to convert a change in
pressure or capacitance, which occurs at a certain area of the
display unit 151, into an electrical input signal. The touch sensor
may be configured to detect the pressure exerted during a touch as
well as the position or area of the touch.
[0075] Upon touch on the touch sensor, a corresponding signal is
transferred to a touch controller. The touch controller processes
the signal to generate corresponding data and transmits the data to
the control unit 180. By doing so, the control unit 180 may
recognize the area of the display unit 151 where the touch
occurred.
[0076] The sound output unit 152 may output audio data received
from the communication unit 110 or stored in the memory 160. The
sound output unit 152 may output sound signals associated with
functions (e.g., call signal receipt sound, message receipt sound,
etc.) performed by the electronic device 100. The sound output unit
152 may include a receiver, a speaker, and a buzzer.
[0077] The memory 160 may store a program for operation of the
control unit 180, and may temporarily store input/output data
(for instance, phone books, messages, still images, videos, etc.).
The memory 160 may store data relating to vibrations and sounds
having various patterns, which are output when the touch screen is
touched.
[0078] The memory 160 may include at least one storage medium of
flash memory types, hard disk types, multimedia card micro types,
card type memories (e.g., SD or XD memories), RAMs (Random Access
Memories), SRAM (Static Random Access Memories), ROMs (Read-Only
Memories), EEPROMs (Electrically Erasable Programmable Read-Only
Memories), PROM (Programmable Read-Only Memories), magnetic
memories, magnetic discs, and optical discs. The electronic device
100 may operate in association with a web storage performing a
storage function of the memory 160 over the Internet.
[0079] The interface unit 170 functions as a path between the
electronic device 100 and any external device connected to the
electronic device 100. The interface unit 170 receives data or
power from an external device and transfers the data or power to
each component of the electronic device 100 or enables data to be
transferred from the electronic device 100 to the external device.
For instance, the interface unit 170 may include a wired/wireless
headset port, an external recharger port, a wired/wireless data
port, a memory card port, a port connecting a device having an
identification module, an audio I/O (Input/Output) port, a video
I/O port, and an earphone port.
[0080] The control unit 180 controls the overall operation of the
electronic device 100. For example, the control unit 180 performs
control and processes associated with voice call, data
communication, and video call. The control unit 180 may include an
image processing unit 182 for image processing. The image processing
unit 182 is described below in relevant parts in greater
detail.
[0081] The power supply unit 190 receives internal or external
power under control of the control unit 180 and supplies the power
to each component for operation of the component.
[0082] The embodiments described herein may be implemented in
software or hardware or in a combination thereof, or in a recording
medium readable by a computer or a similar device to the computer.
When implemented in hardware, the embodiments may use at least one
of ASICs (application specific integrated circuits), DSPs (digital
signal processors), DSPDs (digital signal processing devices), PLDs
(programmable logic devices), FPGAs (field programmable gate
arrays), processors, controllers, micro-controllers,
microprocessors, and electrical units for performing functions.
According to an embodiment, the embodiments may be implemented by
the control unit 180.
[0083] When implemented in software, some embodiments, such as
procedures or functions, may entail a separate software module for
enabling at least one function or operation. Software codes may be
implemented by a software application written in proper programming
language. The software codes may be stored in the memory 160 and
may be executed by the control unit 180.
[0084] FIG. 2 is a view illustrating an example where a user inputs
a gesture to an electronic device as shown in FIG. 1.
[0085] Referring to FIG. 2, the electronic device 100 may capture
the gesture of the user U and may perform a proper function
corresponding to the gesture.
[0086] The electronic device 100 may be any electronic device
having the display unit 151 that can display images. The electronic
device 100 may be a stationary terminal, such as a TV shown in FIG.
2, which is bulky and thus placed in a fixed position, or may be a
mobile terminal such as a cell phone. The electronic device 100 may
include the camera 121 that may capture the gesture of the user
U.
[0087] The camera 121 may be an optical electronic device that
performs image capturing in a front direction of the electronic
device 100. The camera 121 may be a 2D camera for 2D image
capturing and/or a 3D camera for 3D image capturing. Although in
FIG. 2 one camera 121 is provided at a top central portion of the
electronic device 100 for ease of description, the number,
location, and type of the camera 121 may vary as necessary.
[0088] The control unit 180 may trace a user U having a control
right when discovering the user U. The issue and trace of the
control right may be performed based on an image captured by the
camera 121. For example, the control unit 180 may analyze a
captured image and continuously determine whether a specific
user U exists, whether the specific user U performs a gesture
necessary for obtaining the control right, and whether the specific
user U moves or not.
[0089] The control unit 180 may analyze a gesture of a user having
the control right based on a captured image. For example, when the
user U makes a predetermined gesture but does not own the control
right, no function may be conducted. However, when the user U has
the control right, a predetermined function corresponding to the
predetermined gesture may be conducted.
[0090] The gesture of the user U may include various operations
using his/her body. For example, the gesture may include the
operation of the user sitting down, standing up, running, or even
moving. Further, the gesture may include operations using the
user's head, foot, or hand H. For convenience of illustration, a
gesture of using the hand H of the user U is described below as an
example. However, the embodiments of the present invention are not
limited thereto.
[0091] According to an embodiment, analysis of a hand gesture may
be conducted in the following ways.
[0092] First, the user's fingertips are detected, the number and
shape of the fingertips are analyzed, and the result is then
converted into a gesture command.
[0093] The detection of the fingertips may be performed in two
steps.
[0094] First, a step of detecting a hand area may be performed
using the skin tone of a human. A group of candidates for the hand
area is designated and contours of the candidates are extracted
based on the human's skin tone. Among the candidates, a candidate
whose contour has a number of points falling within a predetermined
range may be selected as the hand.
[0095] Secondly, as a step of determining the fingertips, the
contour of the candidate selected as the hand is traversed and a
curvature is calculated based on inner products between adjacent
points. Since fingertips show sharp variations in curvature, when
the curvature at a contour point exceeds a threshold value, that
point is chosen as a fingertip of the hand. The fingertips thus
extracted may be converted into meaningful commands during
gesture-command conversion.
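By way of illustration only, the curvature test of paragraph [0095] might be sketched in Python as follows. The patent does not specify an implementation, so the neighbor offset k and the cosine threshold are assumed values, and the contour is assumed to be an ordered list of 2D points already extracted in the hand-detection step.

    import numpy as np

    def find_fingertips(contour, k=5, cos_threshold=0.7):
        # Walk the closed hand contour; for each point, take the vectors
        # to the k-th previous and k-th next contour points and use their
        # normalized inner product as a curvature measure.
        contour = np.asarray(contour, dtype=float)
        n = len(contour)
        fingertips = []
        for i in range(n):
            p = contour[i]
            v_prev = contour[(i - k) % n] - p
            v_next = contour[(i + k) % n] - p
            denom = np.linalg.norm(v_prev) * np.linalg.norm(v_next)
            if denom == 0.0:
                continue
            # At a sharp fingertip the two vectors fold back on each
            # other, so their normalized inner product is large.
            if np.dot(v_prev, v_next) / denom > cos_threshold:
                fingertips.append((p[0], p[1]))
        return fingertips

A flat stretch of contour yields nearly opposite vectors (inner product near -1), while a fingertip yields nearly parallel ones, which is why a single threshold on the normalized inner product suffices in this sketch.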
[0096] According to an embodiment, it is often necessary, with
respect to a gesture command for a synthesized virtual 3D image (3D
object), to judge whether a contact has occurred between the virtual
3D image and a user's gesture. For example, as is often the case, it
may be necessary to determine whether there is contact between an
actual object and a virtual object in order to manipulate a virtual
object superimposed on the actual object.
[0097] Whether the contact is present or not may be determined by
various collision detection algorithms. For instance, a rectangle
bounding box method and a bounding sphere method may be adopted for
such judgment.
[0098] The rectangle bounding box method compares areas of
rectangles surrounding a 2D object for collision detection. The
rectangle bounding box method has merits such as a light
computational burden and simple implementation. The bounding sphere
method determines whether there is a collision by comparing the
radii of the spheres surrounding 3D objects against the distance
between their centers.
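A minimal sketch of the two collision tests named above, assuming axis-aligned rectangles given as (x_min, y_min, x_max, y_max) tuples and spheres given as a center and a radius; the sphere test treats a collision as the distance between centers not exceeding the sum of the radii.

    def rects_collide(a, b):
        # Axis-aligned rectangle bounding box test for 2D objects.
        return (a[0] <= b[2] and b[0] <= a[2] and
                a[1] <= b[3] and b[1] <= a[3])

    def spheres_collide(c1, r1, c2, r2):
        # Bounding-sphere test for 3D objects: compare the squared
        # distance between centers with the squared sum of the radii.
        dx, dy, dz = c1[0] - c2[0], c1[1] - c2[1], c1[2] - c2[2]
        return dx * dx + dy * dy + dz * dz <= (r1 + r2) ** 2

Comparing squared distances avoids a square root per test, which is the usual reason these coarse checks are cheap enough to run every frame.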
[0099] For example, a depth camera may be used for manipulation of
a real hand and a virtual object. Depth information of the hand, as
obtained by the depth camera, is converted into distance units of
the virtual world for rendering of the virtual image, and collision
with the virtual object may be detected based on the resulting
coordinates.
[0100] Hereinafter, an exemplary environment in which the
embodiments of the present invention are implemented is described.
FIG. 3 is a view for describing an environment according to an
embodiment of the present invention.
[0101] Referring to FIG. 3, a first user U1 and a second user U2
are positioned in a first place and a second place, respectively.
The first user U1 may be a person who hosts a video conference
and/or provides lectures to a number of other people including the
second user U2, and the second user U2 may be a person who attends
the video conference hosted by the first user U1.
[0102] A voice and/or motion of the first user U1 may be obtained
and converted into video data and/or audio data by an electronic
device 200 arranged in the first place. Further, the video data
and/or audio data may be transferred through a predetermined
network (communication network) to another electronic device 300
positioned in the second place. The electronic device 300 may
output the transferred video data and/or audio data through an
output unit in a visual or auditory manner. The electronic device
200 and the electronic device 300 each may be the same or
substantially the same as the electronic device 100 described in
connection with FIG. 1. However, according to an embodiment, each
of the electronic devices 200 and 300 may include only some of the
components of the electronic device 100. According to an
embodiment, the components of the electronic device 200 may be
different from the components of the electronic device 300.
[0103] FIG. 3 illustrates an example where the electronic device
200 obtains and transfers the video data and/or audio data and the
electronic device 300 outputs the transferred video data and/or
audio data. According to an embodiment, the electronic devices 200
and 300 may switch their functions and operations with each other,
or alternatively, each of the electronic devices 200 and 300 may
perform all of the functions described above.
[0104] For example, the first user U1 may transfer his image and/or
voice through the electronic device 200 to the electronic device
300 and may receive and output an image and/or voice of the second
user U2. Likewise, the electronic device 300 may also perform the
same functions and operations as the electronic device 200.
[0105] Hereinafter, a method of controlling an electronic device
according to an embodiment of the present invention is described.
For purposes of illustration, the control method is performed by
the electronic device 100 described in connection with FIG. 1. As
used herein, the "first electronic device" refers to the electronic
device 300 shown in FIG. 3, which is positioned in the second
place, and the "second electronic device" refers to the electronic
device 200 shown in FIG. 3, which is positioned in the first place.
However, the embodiments of the present invention are not limited
thereto.
[0106] FIG. 4 is a flowchart illustrating a method of controlling
an electronic device according to an embodiment of the present
invention.
[0107] Referring to FIG. 4, the control method of an electronic
device may include a step of receiving a video image from a second
electronic device 200 (S100), a step of displaying the video image
on the display unit 151 (S110), a step of identifying a specific
area of the video image (for example, an area where a presentation
material necessary for performing a video conference is displayed)
(S120), and a step of storing an image (e.g., first image)
displayed on the specific area (S130).
[0108] The first electronic device 300 may further include a step
of determining whether a variation of the image displayed on the
specific area is equal to or larger than a predetermined threshold
(S140) and a step of storing an image (e.g., second image)
displayed on the specific area when the variation of the image is
equal to or larger than the predetermined threshold (S150). When
the variation of the image is smaller than the predetermined
threshold, the first electronic device 300 may continue to monitor
whether the variation of the image becomes equal to or larger than
the threshold (S140).
[0109] The first electronic device 300 may continuously display
video images received from the second electronic device 200 on the
display unit 151. When the first electronic device 300 receives a
predetermined request while performing steps S100 to S150 (S160),
the first electronic device 300 obtains an image corresponding to
the request among images stored in the memory 160 in steps S130
and/or S150 (S170) and displays the obtained image on the display
unit 151 (S180). Hereinafter, the steps are described in greater
detail.
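The flow of steps S100 through S180 could be sketched as a single receive-monitor-serve loop. Everything below — the device object and its methods — is an assumed interface used only to make the sequence of steps concrete, not the patent's implementation.

    def receiver_loop(device, threshold):
        stored = []     # images of the specific area, in order of storage
        area = None
        for frame in device.stream_video():          # S100: receive video image
            device.display(frame)                    # S110: display video image
            if area is None:
                area = device.identify_specific_area(frame)  # S120
                stored.append(device.crop(frame, area))      # S130: first image
                continue
            current = device.crop(frame, area)
            if device.variation(current, stored[-1]) >= threshold:
                stored.append(current)               # S140/S150: store on change
            steps_back = device.poll_request()       # S160: gesture/voice/button
            if steps_back:                           # e.g. "to the previous page"
                index = max(len(stored) - 1 - steps_back, 0)   # S170
                device.display_still(stored[index])            # S180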
[0110] The first electronic device 300 positioned in the second
place may receive a video image from the second electronic device
200 (S100). The video image may be streamed from the second
electronic device 200 to the first electronic device 300.
[0111] The video image may be obtained by the second electronic
device 200. For instance, the video image may include a scene
relating to a video conference performed by the first user U1 or a
scene relating to an online lecture conducted by the first user U1
as obtained by the second electronic device 200.
[0112] The video image may be a video image that is obtained by the
camera 121 included in the second electronic device 200 and
reflects a real situation. The video image may be a composite image
of a virtual image and a video image reflecting a real situation.
At least part of the video image reflecting the real situation may
be replaced by another image.
[0113] The video image may be directly transmitted from the second
electronic device 200 to the first electronic device 300 or may be
transmitted from the second electronic device 200 to the first
electronic device 300 via a server (not shown).
[0114] The first electronic device 300 may generate a control
signal for visually representing a video image (S110). The first
electronic device 300 may visually output the video image through
the display unit 151 or a beam projector (not shown) according to
the control signal.
[0115] The first electronic device 300 may identify a specific area
included in the video image (S120). For instance, the first
electronic device 300 may identify an area which displays a
presentation material necessary for performing a video
conference.
[0116] FIG. 5 is a view illustrating a video image displayed by an
electronic device according to an embodiment of the present
invention. Referring to FIG. 5, a video image of the first user U1
at the first place, obtained by the second electronic device 200 as
shown in FIG. 3, may be displayed through the first electronic
device 300. A first image I1 of a presentation material (or lecture
material; both are jointly referred to as "presentation material")
necessary for a video conference and/or lecture may be displayed on
a specific area SA.
The first image I1 may be an image reflecting an actual situation.
Alternatively, the first image I1 may be a composite image made by
the second electronic device 200.
[0117] The first electronic device 300 may identify the specific
area SA on which the presentation material is displayed as
described above. The first electronic device 300 may employ various
methods to identify the specific area SA. For example, the first
electronic device 300 may use an image processing technology to
analyze the video image and to identify an area on which marks such
as letters and/or diagrams are densely displayed, so that the
specific area SA may be recognized. As another example, the second
electronic device 200 may transmit location information of the
specific area SA to the first electronic device 300 together with
or separately from the video image upon transmission of the video
image. The first electronic device 300 may identify the specific
area SA based on the transmitted location information.
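As one illustrative heuristic for the first identification method (the patent leaves the exact image processing open), a mark-dense region could be located by scoring fixed-size blocks of a grayscale frame by gradient density, since areas packed with letters and diagrams tend to contain many strong edges. The block size is an assumed parameter.

    import numpy as np

    def densest_block(gray, block=64):
        # Score each block-sized tile of the frame by its mean gradient
        # magnitude, a rough proxy for text/diagram density, and return
        # the highest-scoring tile as (x, y, width, height).
        gy, gx = np.gradient(gray.astype(float))
        energy = np.abs(gx) + np.abs(gy)
        best_score, best_xy = -1.0, (0, 0)
        for y in range(0, gray.shape[0] - block + 1, block):
            for x in range(0, gray.shape[1] - block + 1, block):
                score = energy[y:y + block, x:x + block].mean()
                if score > best_score:
                    best_score, best_xy = score, (x, y)
        return (best_xy[0], best_xy[1], block, block)

In practice the second method described above — the second electronic device 200 simply transmitting the area's location — would be more robust, since it does not depend on the content of the slide.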
[0118] Subsequently, the first electronic device 300 may store the
image displayed on the specific area (S130). For example, the first
electronic device 300 may store the image for the presentation
material included in the video image. The image may be stored in
the memory 160.
[0119] The presentation material may include a number of pages or
may include a video material. The image displayed on the specific
area, which is stored in step S130, may be a still image for part
of the presentation material displayed on the specific area at a
particular time point. For example, in the case that the
presentation material includes several pages, the image stored in
step S130 (hereinafter, referred to as "a first image") may be an
image for a particular page that is displayed at a time point when
step S120 and/or step S130 are performed among the pages included in
the presentation material. In the case that the presentation
material includes a video, the image stored in step S130 may be an
image for a particular frame displayed at a time point when step
S120 and/or step S130 are performed among a plurality of frames
included in the video (presentation material).
[0120] The second user, who attends the video conference at the
second place, may store images of the presentation material without
separately receiving data for the presentation material from the
first user, who conducts the video conference at the first place.
The images of the presentation material may be extracted from the
video image provided through the video conference and stored by the
electronic device used by the second user, without any cumbersome
process such as receiving in advance separate electronic data (for
example, electronic files) for the presentation material used for
the video conference by the first user.
[0121] For example, in the case that the conference is performed
with materials that are difficult to convert into data (for
example, samples used for introducing a prototype model), or the
first user has not converted the presentation material into
electronic data in advance, the presentation material may,
according to an embodiment, be converted into image data while
being used for the conference, so that the second user may later
review the presentation material used for the video conference.
[0122] The first electronic device 300 determines whether the
variation of the image displayed on the specific area is equal to
or more than a predetermined threshold (S140), and when the
variation is determined to be not less than the threshold, the
first electronic device 300 may store the image displayed on the
specific area (S150). However, when the variation is less than the
threshold, the first electronic device 300 may continue to monitor
whether the variation becomes equal to or more than the threshold
(S140), without separately storing the image.
[0123] To perform step S140, the first electronic device 300
receives and displays the video image and continues to monitor the
specific area. The first electronic device 300 may continuously
perform step S140 and monitor whether there is any change to the
presentation material displayed on the specific area (e.g., content
displayed on the specific area).
[0124] For example, in the case that the first user U1 changes the
presentation material from a first material to a second material
while performing the conference at the first place, the first
electronic device 300 may sense a change to the image displayed on
the specific area (S140) and may store the image displayed on the
specific area after such change separately from the first image
stored in step S130 (S150).
[0125] As another example, in the case that the presentation
material includes a plurality of pages, when the first user U1
changes the presentation material from an Nth page to an N+1th
page, the first electronic device 300 may sense a change to the
image displayed on the specific area (S140) and may store the image
displayed on the specific area after such change separately from
the first image stored in step S130 (S150). FIG. 6 is a view
illustrating a screen viewed after the page of the presentation
material displayed on the specific area has changed according to an
embodiment of the present invention. Referring to FIGS. 5 and 6, it
can be seen that the image displayed on the specific area SA has
changed from the first image I1 to the second image I2. According
to an embodiment, the first electronic device 300 may store the
second image I2 in the memory 160 separately from the first image
I1.
[0126] For example, in the case that the presentation material is a
video material, the first electronic device 300 may sense a change
to the image displayed on the specific area (S140) and may store
the image displayed on the specific area after such change
separately from the first image stored in step S130 (S150). For
example, an image corresponding to an Nth frame of the video
material is stored as the first image, and an image corresponding
to an N+a-th frame, whose variation from the image corresponding to
the Nth frame is equal to or more than a predetermined threshold,
may be stored in step S150 (where a is an integer equal to or more
than 1). For example, when the difference (variation) between the
image corresponding to the Nth frame and the images corresponding
to the N+1th and N+2th frames does not exceed the threshold, the
first electronic device 300 does not store the images corresponding
to the N+1th and N+2th frames of the video material. However, when
the change (variation) between the image corresponding to the Nth
frame and an image corresponding to the N+3th frame of the video
material is in excess of the threshold, the first electronic device
300 stores the image corresponding to the N+3th frame in step S150.
Accordingly, even when the presentation material provided by the
first user is a video material, an image corresponding to a frame
positioned at a boundary where the image changes substantially may
be stored in the first electronic device 300, so that the second
user may review the presentation material later.
[0127] The first electronic device 300 may compare an image
displayed in real time on the specific area with the first image
stored in step S130 (or, when steps S140 and S150 are repeated, with
the image stored most recently) and may yield a variation. For
example, in the case that the presentation material is a video, the
image currently displayed on the specific area is an image
corresponding to the Nth frame, and the stored first image (or the
image stored most recently) is an image corresponding to the N-5th
frame, the first electronic device 300 may compare the image
corresponding to the Nth frame with the image corresponding to the
N-5th frame, rather than with the image corresponding to the N-1th
frame, and may produce a variation.
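The patent does not fix a particular variation measure; as one assumption, the mean absolute pixel difference between the image currently shown on the specific area and the most recently stored image could serve. This also captures the N-5th-frame comparison above, since the reference is the last stored image rather than the immediately preceding frame.

    import numpy as np

    def variation(current, last_stored):
        # Mean absolute pixel difference between the image now displayed
        # on the specific area and the image most recently stored for it.
        diff = current.astype(float) - last_stored.astype(float)
        return float(np.mean(np.abs(diff)))

    def store_if_changed(current, stored, threshold):
        # Steps S140/S150: store a copy only when the change since the
        # last stored image reaches the threshold.
        if variation(current, stored[-1]) >= threshold:
            stored.append(current.copy())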
[0128] Subsequently, the first electronic device 300 repeats steps
S140 and S150 and stores the image displayed on the specific area
SA in the memory 160 whenever the image changes by more than the
threshold. The first electronic device 300 may store the plurality
of images stored in steps S130 and/or S150 in order of storage.
[0129] FIG. 7 is a view illustrating an example of storing a
plurality of images displayed on a specific area according to an
embodiment of the present invention.
[0130] In storing the plurality of images, the first electronic
device 300 may number images stored while the video conference is
performed in order of storage as shown in (a) of FIG. 7 and may
store the images so that they respectively correspond to the
numbers. For
example, it can be seen from (a) of FIG. 7 that the first image I1
is stored Nth and the second image I2 is stored N+1th and that the
first image I1 corresponds to a value "N" and the second image I2
corresponds to a value "N+1".
[0131] In storing the plurality of images, the first electronic
device 300, as shown in (b) of FIG. 7, may obtain time information
on times when the respective images are stored on a time line that
starts counting when a video conference begins and may store the
obtained time information so that the information corresponds to
the respective images. For example, it can be seen from (b) of FIG.
7 that the first image I1 is stored one minute and two seconds
after the video conference has begun, and the second image I2 is
stored two minutes and ten seconds after the video conference has
begun and that the first image I1 corresponds to time information
of "1 minute and 2 seconds" and the second image I2 corresponds to
time information of "2 minute and 10 seconds".
[0132] While performing steps S120 to S150, the first electronic
device 300 may continue to display the video image received from
the second electronic device 200 on the display unit 151. While
continuously performing steps S100 to S150, the first electronic
device 300 may receive a predetermined request (S160).
[0133] The second user U2 may want to review the presentation
material that has just been explained by the first user U1 while
viewing the video conference hosted by the first user U1. In this
case, the
second user U2 may input a predetermined request to the first
electronic device 300. Alternatively, the second user U2 may input
the predetermined request to another electronic device (not shown)
wirelessly or wiredly connected to the first electronic device 300,
and the other electronic device may transfer the predetermined
request and/or the fact that the predetermined request has been
generated to the first electronic device 300. Hereinafter, unless
stated otherwise, it is assumed that the second user U2 directly
inputs the predetermined request to the first electronic device
300.
[0134] The predetermined request may be input by various
methods.
[0135] FIGS. 8 to 10 are views illustrating methods of obtaining a
predetermined request according to embodiments of the present
invention.
[0136] Referring to FIG. 8, the second user U2 makes a particular
gesture (of moving his right arm from right to left). The first
electronic device 300 may have previously associated the specific
gesture shown in FIG. 8 with the predetermined request, and when
recognizing the second user's gesture shown in FIG. 8, may
determine that the predetermined request is input.
[0137] Referring to FIG. 9, the second user U2 generates a specific
voice command (for example, saying "to the previous page" or "to
the Nth page"). The first electronic device 300 may previously have
the voice commands shown in FIG. 9 correspond to the predetermined
request, and when recognizing that the specific voice command is
input from the second user U2, may determine that the predetermined
request is input.
[0138] Referring to FIG. 10, the first electronic device 300 may
display a specific control button CB on the display unit 151 that
is displaying the movie image. The second user U2 may select the
control button CB by touch and/or by input using a mouse. Receiving
the input selected for the control button CB, the first electronic
device 300 may determine that the predetermined request is
input.
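For illustration, the three input methods of FIGS. 8 to 10 may all be reduced to the same predetermined request, as in the following sketch. The event identifiers and the helper function are hypothetical and not part of the disclosure.

    # Hypothetical event identifiers for the three input methods.
    GESTURE_RIGHT_TO_LEFT = "gesture:right_to_left"     # FIG. 8
    VOICE_PREVIOUS_PAGE = "voice:to the previous page"  # (a) of FIG. 9
    CONTROL_BUTTON_CB = "button:CB"                     # FIG. 10

    def is_predetermined_request(event):
        # Any of the three inputs is treated as the same predetermined request.
        return event in (GESTURE_RIGHT_TO_LEFT,
                         VOICE_PREVIOUS_PAGE,
                         CONTROL_BUTTON_CB)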
[0139] Subsequently, the first electronic device 300 may obtain an
image corresponding to the received request among the images stored
in the memory 160 in steps S130 and/or S150 (S170) and may display
the obtained image on the display unit 151 (S180).
[0140] Obtaining the image corresponding to the predetermined
request in step S170 may be performed by various methods.
[0141] For example, as described in connection with FIG. 8, in the
case that the predetermined request is input by a user's gesture,
when the user makes the gesture shown in FIG. 8 once, assuming that
the currently displayed image is stored Nth, the N-1th stored image
may be acquired in step S170, and when the user makes the gesture
two times, the N-2th stored image may be obtained in step S170.
[0142] As another example, in the case that the predetermined
request is input by a user's voice command (e.g., when the user
says "to the previous page") as described in connection with (a) of
FIG. 9, when the user generates the voice command ("to the previous
page") shown in (a) of FIG. 9 once, the N-1th stored image
may be acquired in step S170, and when the user generates the voice
command two times, the N-2th stored image may be obtained in step
S170.
[0143] As still another example, as described in connection with
(b) of FIG. 9, in the case that the predetermined request is input
by a user's voice command (e.g., when the user speaks "to the Nth
page"), an image corresponding to the page targeted by the user's
voice command may be obtained in step S170.
[0144] As yet still another example, in the case that the
predetermined request is input by the control button CB separately
displayed on the display unit 151 as described in connection with
FIG. 10, when the control button CB is selected once, the N-1th
stored image may be obtained in step S170 (when the currently
displayed image is the Nth stored image), and when the control
button CB is selected twice, the N-2th stored image may be obtained
in step S170.
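The back-stepping behavior of paragraphs [0141] to [0144] may be pictured as index arithmetic over the stored images, as in the following sketch; the list layout follows the earlier ImageStore sketch and is an assumption.

    def obtain_image(records, current_index, repetitions):
        # records: list of (sequence number, elapsed seconds, image).
        # current_index: N, the index at which the currently displayed
        # image is stored. One request yields the N-1th stored image,
        # two requests the N-2th, and so on (steps S160 to S170).
        target = max(current_index - repetitions, 1)  # clamp at the first image
        return records[target - 1][2]

    def obtain_page(records, n):
        # "to the Nth page" ((b) of FIG. 9): fetch the Nth stored image directly.
        return records[n - 1][2]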
[0145] As described above, the obtained image may be displayed on
the display unit 151. The first electronic device 300 may display
the obtained image by various methods.
[0146] FIGS. 11 and 12 are views illustrating exemplary methods of
displaying an obtained image according to embodiments of the
present invention.
[0147] Referring to FIG. 11, (a) of FIG. 11 illustrates an example
where while the first user U1 holds a video conference with the xth
page of the presentation material having a plurality of pages, a
video image reflecting the video conference is output through the
first electronic device 300. As described above, the xth page I3 of
the presentation material may be displayed on the specific area SA.
In this case, when the second user U2 inputs a voice command by
saying "to the first page", the first electronic device 300 may
keep displaying the video image received from the second electronic
device 200 on the display unit 151 while displaying on the specific
area SA the image I1 corresponding to the first page among the
images stored in the memory 160 as shown in (b) of FIG. 11 instead
of the presentation material currently received from the second
electronic device 200.
[0148] Referring to FIG. 12, (a) of FIG. 12 illustrates the same
situation as in (a) of FIG. 11. A voice command may be input by the
second user U2 saying "to the second page". On the contrary to
those described in connection with FIG. 11, an image 13
corresponding to the presentation material currently received from
the second electronic device 200 is continuously displayed on the
specific area SA while an image 12 corresponding to the second
page, which is obtained by the predetermined request, may be
displayed on a region R of the display unit 151. Accordingly, the
second user U2 may review the previous page of the presentation
material while simultaneously continuing to view the video
conference held by the first user U1.
[0149] As such, the second user may review the previous pages of
the presentation material used for the video conference hosted by
the first user while the video conference is in progress, thereby
enabling a more efficient video conference.
[0150] A method of controlling an electronic device according to an
embodiment of the present invention is now described.
[0151] FIG. 13 is a view schematically illustrating an environment
to which an embodiment of the present invention applies, and FIG.
14 is a flowchart illustrating a method of controlling an
electronic device according to an embodiment of the present
invention.
[0152] Referring to FIG. 13, third to sixth users U3, U4, U5, and
U6 attend a video conference through a third electronic device 400
located at a third place, and seventh and eighth users U7 and U8
attend the video conference through a fourth electronic device 500
located at a fourth place. In this embodiment described in
connection with FIG. 13, in addition to the electronic devices 200
and 300 as described in connection with FIG. 3, more electronic
devices participate in the video conference. However, this is
merely an example for ease of description, and the embodiments of
the present invention are not limited thereto.
[0153] In the environment as illustrated in FIG. 13, the
"electronic device" refers to the first electronic device 300
located at the second place, and the "(an)other electronic
device(s)" refer(s) to at least one of the second electronic device
200 located at the first place, the third electronic device 400
located at the third place, and the fourth electronic device 500
located at the fourth place unless stated otherwise.
[0154] As shown in FIG. 14, the control method may include a step
of receiving multimedia data obtained by the electronic devices
200, 400, and 500 (S200), a step of sensing a human voice from the
received multimedia data at a first time point (S210), a step of
identifying a first speaker corresponding to the sensed voice
(S220), a step of obtaining information relating to the identified
speaker (S230), and a step of storing the obtained information so
that the obtained information corresponds to the first time point
(S240).
[0155] According to an embodiment, the control method may further
include a step of determining whether the speaker corresponding to
the human voice included in the multimedia data changes from the
first speaker to a second speaker (S250), a step of, when the
speaker changes to the second speaker, identifying the changed second
speaker (S260), a step of obtaining information relating to the
second speaker (S270), and a step of storing the obtained
information so that the obtained information corresponds to a
second time point when the speaker changes to the second speaker
(S280). The information relating to the first speaker and/or second
speaker may include personal information of each speaker,
information on the place where each speaker is positioned during
the course of the video conference, and keywords included in the
speech which each speaker makes. Each step is now described in
greater detail.
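Before each step is described in detail, steps S200 to S280 may be summarized as a monitoring loop, as in the following sketch. The stream format and the helper callables passed in (sense_voice, identify_speaker, gather_info) are assumptions; possible forms of the identification and information-gathering helpers are sketched later in this description.

    def metadata_loop(stream, sense_voice, identify_speaker, gather_info):
        # stream yields (elapsed seconds, multimedia clip) pairs (S200).
        metadata = []
        current_voice = None
        for elapsed, clip in stream:
            voice = sense_voice(clip)              # S210 / S250
            if voice is None or voice == current_voice:
                continue                           # no change of speaker
            current_voice = voice
            speaker = identify_speaker(clip)       # S220 / S260
            info = gather_info(speaker)            # S230 / S270
            info["time"] = elapsed                 # S240 / S280: store against the time point
            metadata.append(info)
        return metadata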
[0156] FIGS. 15 to 19 are views illustrating embodiments of the
present invention.
[0157] The first electronic device 300 may receive multimedia data
obtained by the other electronic devices 200, 400, and 500 (S200).
The first electronic device 300 may receive first multimedia data
obtained for the first user U1 attending the video conference at
the first place, third multimedia data obtained for the third to
sixth users U3, U4, U5, and U6 attending the video conference at
the third place, and fourth multimedia data obtained for the
seventh and eighth users U7 and U8 attending the video conference
at the fourth place.
[0158] The multimedia data may include video data reflecting in
real time each user attending the video conference and images
surrounding the users as well as audio data reflecting in real time
each user attending the video conference and sound surrounding the
users.
[0159] The first electronic device 300 may output the received
multimedia data through the output unit 150 in real time. For
example, the video data included in the multimedia data may be
displayed on the display unit 151, and the audio data included in
the multimedia data may be audibly output through the sound output
unit 152. According to an embodiment, while the received multimedia
data is displayed, the second multimedia data obtained for the
second user U2 and his surroundings directly by the camera 121
and/or microphone 122 of the first electronic device 300 may be
also output. Hereinafter, unless stated otherwise, the
"multimedia data obtained by the first electronic device 300"
includes multimedia data obtained by and received from the
electronic devices 200, 400, and 500 and multimedia data directly
obtained by the user input unit 120 of the first electronic device
300.
[0160] The received multimedia data may be stored in the memory
160.
[0161] The first electronic device 300 may display all or at least
selected one of the video data clips included in the multimedia
data obtained by the first electronic device 300 on the display
unit 151. Likewise, all or at least selected one of the audio data
clips included in the multimedia data may also be output through
the sound output unit 152. As used herein, the "multimedia data
clip" may refer to part of the multimedia data, the "video data
clip" may refer to part of the video data, and the "audio data
clip" may refer to part of the audio data.
[0162] As described above, while outputting the multimedia data
obtained by the first electronic device 300 through the output unit
150, the first electronic device 300 may sense a human voice by
analyzing the audio data included in at least one multimedia data
clip of the multimedia data (S210). Hereinafter, the time when the
human voice is sensed in step S210 is referred to as "first time
point".
[0163] For example, as shown in FIG. 15 and (a) of FIG. 16, the
fifth user U5 positioned at the third place may start to speak at a
first time point. The first electronic device 300 may sense that
the fifth user has started to speak at the first time point by
analyzing the multimedia data (in particular, audio data) received
from the electronic device 400. (b) of FIG. 16 illustrates an
example where the multimedia data is output through the first
electronic device 300.
[0164] Subsequently, the first electronic device 300 may identify a
speaker corresponding to the sensed voice (S220). For example, the
first electronic device 300 may identify which user has generated
the voice among the first to eighth users U1, U2, U3, U4, U5, U6,
U7, and U8 that attend the video conference. To identify the
speaker corresponding to the sensed voice, the first electronic
device 300 may identify which electronic device has sent the
multimedia data including the voice among the electronic devices
200, 400, and 500. For example, in the case that the voice is
included in the multimedia data received from the second electronic
device 200 located at the first place, the first electronic device
300 may determine that the speaker corresponding to the voice is
the first user U1 located at the first place.
[0165] According to an embodiment, the first electronic device 300
may analyze the video data included in the multimedia data to
identify the speaker corresponding to the sensed voice. According
to an embodiment, the first electronic device 300 may analyze
images for respective users reflected by the video data after the
voice has been sensed to determine which user has generated the
voice. For example, the first electronic device 300 may determine
the current speaker by recognizing each user's face and analyzing
the recognized face (e.g., each user's lips). For example, as shown
in (b) of FIG. 16, the first electronic device 300 may analyze each
user's face and when recognizing that the fifth user's lips move,
may determine that the fifth user U5 is the current speaker
SP1.
[0166] The first electronic device 300 may use both the method of
identifying the electronic device that has sent the multimedia data
and the method of analyzing the video data included in the
multimedia data to identify the speaker corresponding to the sensed
voice.
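A sketch combining the two identification methods, first narrowing the candidates by the sending electronic device and then analyzing the video data, might look as follows. The clip attributes and both helper callables are assumptions introduced for illustration.

    def identify_speaker(clip, attendees_at, lips_moving):
        # Method 1: the sending device narrows the candidates; if only one
        # attendee is at that device's place (e.g., the first user U1 at
        # the first place), the speaker is identified at once.
        candidates = attendees_at(clip.source_device)
        if len(candidates) == 1:
            return candidates[0]
        # Method 2: analyze the video data, e.g., recognize each
        # candidate's face and check whose lips are moving.
        for user in candidates:
            if lips_moving(clip.video, user):
                return user
        return None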
[0167] Step S210 and/or S220 need not be performed by the first
electronic device 300. According to an embodiment, step S210
and/or S220 may be performed by the other electronic devices 200,
400, and 500. For example, each electronic device 200, 400, or 500
may determine whether a voice is included in multimedia data it
receives, and if the voice is determined to be included, may
analyze the video data included in the multimedia data to determine
who is the speaker corresponding to the sensed voice as described
above. If step S210 and/or S220 is performed by each of the
electronic devices 200, 400, and 500, information on the speaker
determined by the electronic devices 200, 400, and 500 may be
transmitted to the first electronic device 300 along with the
multimedia data. Information on the time point when the voice was
sensed may be also transmitted to the first electronic device
300.
[0168] The first electronic device 300 may then obtain information
relating to the identified speaker (S230). Such information may be
diverse. For example, the information may include personal
information on the speaker, information on the place where the
speaker was positioned during the video conference, and keywords
which the speaker has spoken.
[0169] The personal information on the speaker may be obtained from
a database of personal information for the conference attendees
which is previously stored. The database may be provided in the
first electronic device 300 or at least one of the electronic
devices 200, 400, and 500, or established in a distributed manner
across the electronic devices 200, 300, 400, and 500. Alternatively,
the database may be
provided in a server (not shown) connected over a communication
network. The personal information may include names, job positions,
and divisions of, e.g., the conference attendees.
[0170] The first electronic device 300 may receive information on
the place where the speaker is located from the electronic devices
200, 400, and 500. Alternatively, the first electronic device 300
may obtain the place information based on IP addresses used by the
electronic devices 200, 400, and 500 for the video conference. The
place information may include any information that is conceptually
discerned and can distinguish one place from another, as well as
information, such as an address, that geographically specifies a
location. For example, the place information may include an
address, such as "xxx, Yeoksam-dong, Gangnam-gu, Seoul", a team
name, such as "Financial Team" or "IP group", a branch name, such
as "US branch of XX company" or "Chinese branch of XX company", or
a company name, such as "A corporation" or "B corporation".
[0171] While the identified speaker makes a speech, the first
electronic device 300 may analyze what the speaker says,
determine words, phrases, or sentences repeatedly spoken, and
consider the repeatedly spoken words, phrases, or sentences as
keywords of the speech the speaker has made. The first electronic
device 300 may directly analyze audio data reflecting what the
speaker has spoken or may convert the audio data into text data
through an STT (Speech-To-Text) engine and analyze the converted
text data.
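The keyword determination of paragraph [0171], counting repeatedly spoken words in text converted by an STT engine, may be sketched with the standard library as follows; treating two or more repetitions as a keyword is an assumption.

    from collections import Counter

    def extract_keywords(stt_text, min_repeats=2):
        # Count the words of the transcribed speech and consider those
        # repeated at least min_repeats times as keywords of the speech.
        counts = Counter(stt_text.lower().split())
        return [word for word, n in counts.items() if n >= min_repeats]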
[0172] Subsequently, the first electronic device 300 may store the
obtained information so that the obtained information corresponds
to the first time point (S240).
[0173] The first time point may be information specifying a time
determined on a time line whose counting commences when the video
conference begins. For example, if the fifth user U5 starts to
speak 15 seconds after the video conference has commenced, the
first time point may be the "15 seconds".
[0174] In the example described in connection with FIGS. 15 and 16,
assume that the fifth user U5, who is recognized as the first
speaker SP1, is named "Mike", that he belongs to the "Intellectual
Property Strategy Group" (or simply "IP group"), that his position
is "Manager", that he attends the video conference at the "Third
place", and that a keyword he has spoken is "Tele-presence
Technology". In this case, the first electronic device 300 may
store "Mike", "Intellectual Property Strategy Group", "Manager",
"Third place", and "Tele-presence Technology" so that the
information corresponds to the first time point.
[0175] Hereinafter, a set of the information (e.g., personal
information of users, place information, keywords, etc.) stored
corresponding to the time point when the speaker begins to speak
(e.g., the first time point as in the above example) is referred to
as "metadata", and the name, division, position, place information,
and keyword are referred to as fields of the metadata. In the
above-described example, it has been described that the metadata
for the video conference includes the personal information, place
information, and keywords. However, this is merely an example, and
according to an embodiment, other fields may be added to the
metadata for the video conference.
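One metadata entry, with the fields named above, may be sketched as a simple data structure; the field types are assumptions, and further fields may be added according to an embodiment.

    from dataclasses import dataclass

    @dataclass
    class MetadataEntry:
        # One entry per speaker turn, stored so that it corresponds to the
        # time point when the speaker begins to speak.
        time_point: float   # seconds since the video conference began
        name: str
        division: str
        position: str
        place: str
        keyword: str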
[0176] According to an embodiment, the control method may continue
to monitor whether the speaker corresponding to the human voice
included in the multimedia data changes from the current speaker
(e.g., the fifth user in the above-described example) to another
user (S250). In the example illustrated in FIG. 16, the speaker is
the fifth user U5. However, while the video conference is in
progress, the speaker may change to the fourth user U4 as shown in
FIG. 17. As such, the first electronic device 300 may keep
monitoring any change of the speaker.
[0177] The first electronic device 300 may determine whether the
speaker changes by analyzing the audio data included in the
multimedia data received from the electronic devices 200, 400, and
500. For example, the first electronic device 300 may determine
whether a human voice included in the audio data is identical to
the previous voice, and if not, may determine that the speaker has
changed.
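A hedged sketch of this check: compare a characteristic of the current voice with that of the previous voice and report a change when they differ. The embedding helper and the similarity threshold are assumptions introduced here; the disclosure only states that the voices are compared for identity.

    def speaker_changed(prev_voice, curr_voice, voice_embedding, threshold=0.75):
        # voice_embedding is a hypothetical feature extractor returning a
        # vector of floats characterizing a voice sample.
        a, b = voice_embedding(prev_voice), voice_embedding(curr_voice)
        dot = sum(x * y for x, y in zip(a, b))
        norm = (sum(x * x for x in a) ** 0.5) * (sum(y * y for y in b) ** 0.5)
        similarity = dot / norm if norm else 0.0
        return similarity < threshold   # dissimilar voices imply a new speaker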
[0178] If it is determined in step S250 that there is a change of
the speaker, the first electronic device 300 may identify the
changed speaker (S260). For convenience of description, the speaker
identified as speaking after the first speaker SP1, who first
started to speak after the video conference commenced, is referred
to as the "second speaker SP2". In the example illustrated
in FIGS. 16 and 17, the first speaker SP1 is the fifth user U5, and
the second speaker SP2 is the fourth user U4.
[0179] Step S260 may be performed by the same or substantially the
same method as step S220. To identify a speaker corresponding to
the sensed voice, the first electronic device 300 may identify
which electronic device has sent the multimedia data including the
voice among the electronic devices 200, 400, and 500, or may
analyze the video data included in the multimedia data, or may use
both identifying the sending electronic device and analyzing the
video data included in the multimedia data.
[0180] Similar to step S210 and/or S220, step S250 and/or S260 is
not necessarily performed by the first electronic device 300. Each
of the electronic devices 200, 400, and 500 may perform step S250
and/or S260.
[0181] Subsequently, the first electronic device 300 may obtain
information relating to the identified second speaker (S270) and
may store the obtained information so that the information
corresponds to the time point when the speaker changed (e.g., the
second time point) (S280). Steps S270 and S280 may be performed in
a manner identical or similar to steps S230 and S240.
[0182] Thereafter, the first electronic device 300 may repeatedly
perform steps S250 to S280. Accordingly, whenever the speaker
making a speech in the video conference changes, the first
electronic device 300 may obtain information on the changed speaker
and may store the information so that it corresponds to the time
when the change occurred. For example, when the person
speaking in the video conference changes from the fourth user U4 to
the seventh user U7 as shown in FIG. 18 and then to the second user
U2 as shown in FIG. 19, the first electronic device 300 may repeat
steps S250 to S280 and may store information relating to each
speaker.
[0183] Referring to FIG. 20, the first speaker since the beginning
of the video conference is Mike, who is a manager in the IP group
and starts to speak 15 seconds after the video conference has
commenced; Mike attends the video conference at the third place,
and a keyword he comments is `Tele-presence Technology`. The second
speaker is Pitt, who is an assistant in the IP group and begins to
speak 2 minutes and 30 seconds after the video conference has
begun; Pitt attends the video conference at the third place, and a
keyword he comments is `Technology Trend`. Further, the third
speaker is Jack, who is a chief research engineer in the first
research center and starts to make a speech 6 minutes and 40
seconds after the video conference has begun; Jack attends the
video conference at the fourth place, and a keyword he comments is
`Future Prediction`. The fourth speaker is Giggs, who is a senior
research engineer in the second research center and starts to speak
9 minutes and 50 seconds after the video conference has commenced;
Giggs attends the video conference at the second place, and a
keyword he comments is `Natural`.
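Expressed in the record form sketched earlier, the metadata of FIG. 20 amounts to the following (times converted to seconds; the list form itself is an illustrative assumption, not the stored format of the disclosure).

    FIG20_METADATA = [
        {"time": 15,  "name": "Mike",  "division": "IP group",
         "position": "Manager",                 "place": "Third place",
         "keyword": "Tele-presence Technology"},
        {"time": 150, "name": "Pitt",  "division": "IP group",
         "position": "Assistant",               "place": "Third place",
         "keyword": "Technology Trend"},
        {"time": 400, "name": "Jack",  "division": "First research center",
         "position": "Chief research engineer", "place": "Fourth place",
         "keyword": "Future Prediction"},
        {"time": 590, "name": "Giggs", "division": "Second research center",
         "position": "Senior research engineer", "place": "Second place",
         "keyword": "Natural"},
    ]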
[0184] As such, the first electronic device 300 may continue to
monitor the speakers during the course of the video conference and
may store various types of information on the speakers so that the
information corresponds to the time points when the speakers begin
to speak, thereby generating metadata for the video conference.
Further, the video conference metadata may be used in various
manners, thus enhancing user convenience. For example, the video
conference metadata may be used to prepare brief minutes of the
video conference to be provided to the attendees, or may be used to
provide a search function that allows the attendees to review the
conference.
[0185] As described above, the first electronic device 300 may
output the received multimedia data in real time through the output
unit 150 while simultaneously generating the video conference
metadata. For example, the video data included in the multimedia
data may be displayed on the display unit 151, and the audio data
included in the multimedia data may be audibly output through the
sound output unit 152. While the received multimedia data is
displayed, the second multimedia data obtained for the second user
U2 and his surroundings directly by the camera 121 and/or
microphone 122 of the first electronic device 300 may be output as
well. The first electronic device 300 may display the whole video
data included in the multimedia data obtained by the first
electronic device 300 on the display unit 151 at once.
[0186] According to an embodiment, when identifying the current
speaker in step S220 and/or S260, the first electronic device 300
may identify a multimedia data clip including the identified
speaker among a plurality of multimedia data clips obtained and
transmitted by the electronic devices 200, 400, and 500, and may
output the identified multimedia data clip through the output unit
150 by a different method from output methods for the other
multimedia data clips.
[0187] For instance, the first electronic device 300 may display
the identified multimedia data clip (hereinafter, referred to as
"multimedia data clip for speaker") so that the speaker multimedia
data clip appears larger than the other multimedia data clips
(hereinafter, referred to as "multimedia data clips for listener").
For example, as shown in (b) of FIG. 16 to FIG. 19, the multimedia
data clip for speaker is displayed on the first region R1 in larger
size than the multimedia data clips for listener which are
displayed in smaller size on the second region R2. Referring to (b)
of FIG. 16, which illustrates that the speaker is the fifth user
U5, the first electronic device 300 may display a screen image S3
for the multimedia data clip including the fifth user U5 (e.g., the
multimedia data clip for speaker) on the first region R1 and may
display screen images S1, S4, and S2 for the remaining multimedia
data clips for listener on the second region R2. Referring to FIGS.
17 to 19, it can be seen that the screen images S3, S4, and S2 each
including the speaker are displayed on the first region R1.
[0188] As another example, the first electronic device 300 may
output both the video and audio data included in the multimedia
data clip for speaker, among the plurality of multimedia data
clips, through the output unit 150 while outputting only the video
data, but not the audio data, included in the multimedia data clips
for listener. For example, the whole video data included in the
plurality of multimedia data clips may be displayed on the display
unit 151 whereas only the audio data included in the multimedia
data clip for speaker may be selectively output through the sound
output unit 152.
[0189] As still another example, the first electronic device 300
may output only the video and audio data corresponding to the
multimedia data clip for speaker among the plurality of multimedia
data clips through the output unit 150 while receiving the
multimedia data clips for listener from the electronic devices 200,
400, and 500 and storing the received data clips in the memory 160
without outputting the stored data clips through the display unit
151 or the sound output unit 152.
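The three output options of paragraphs [0187] to [0189] differ only in how each clip's video and audio are routed. A sketch of that routing decision, with assumed mode and flag names, follows.

    def route_clip(is_speaker_clip, mode):
        # "enlarge" (paragraph [0187]): display the multimedia data clip
        # for speaker large on the first region R1 and the clips for
        # listener small on the second region R2; play all audio.
        if mode == "enlarge":
            return {"display": True,
                    "region": "R1" if is_speaker_clip else "R2",
                    "audio": True}
        # "speaker_audio" (paragraph [0188]): display all video data but
        # output only the audio of the clip for speaker.
        if mode == "speaker_audio":
            return {"display": True, "region": None, "audio": is_speaker_clip}
        # "speaker_only" (paragraph [0189]): output only the clip for
        # speaker; clips for listener are stored in the memory 160
        # without being output.
        return {"display": is_speaker_clip, "region": None,
                "audio": is_speaker_clip, "store_only": not is_speaker_clip}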
[0190] Although it has been described that the control method is
performed by the first electronic device 300 located at the second
place, the embodiments of the present invention are not limited
thereto. For example, according to an embodiment, the control
method may be performed by each of the electronic device 200
located at the first place, the electronic device 400 located at
the third place, and the electronic device 500 located at the
fourth place.
[0191] By identifying the speaker and outputting the multimedia
data clip corresponding to the speaker in a different manner than
those for the other multimedia data clips, more attention can be
directed toward the user who is making a speech in the video
conference, thereby enabling the video conference to proceed more
efficiently.
[0192] Hereinafter, a method of controlling an electronic device
according to an embodiment of the present invention is described.
The metadata described above may be used to search for the
multimedia data of the video conference at specific time points.
For ease of description, the descriptions made in connection with
FIGS. 13 to 20 may apply to the following embodiments. However, the
control method described below is not limited to being conducted
based on the video conference metadata described above.
[0193] FIG. 21 is a flowchart illustrating a method of controlling
an electronic device according to an embodiment of the present
invention.
[0194] Referring to FIG. 21, the control method may include a step
of receiving a predetermined input (S300), a step of obtaining a
time point according to the received input (S310), and a step of
outputting multimedia data corresponding to the obtained time point
(S320). According to an embodiment, after the video conference ends
or while it is in progress, specific time points of the multimedia
data for the video conference may be searched to review the video
conference, and the multimedia data corresponding to the searched
time points may be output. Search conditions may be defined based on a
predetermined input. Each of the steps is described below in
greater detail.
[0195] The first electronic device 300 may receive a predetermined
input (S300). The predetermined input, which is provided to specify
particular time points of the multimedia data for the video
conference stored in the first electronic device 300, may be
received by various methods. Any method to receive the search
conditions may be used for entry of the predetermined input.
[0196] FIG. 22 illustrates examples of receiving the predetermined
input by various methods according to an embodiment of the present
invention.
[0197] For example, referring to (a) of FIG. 22, when receiving the
predetermined input, the first electronic device 300 may display
input windows F1, F2, F3, F4, and F5 respectively corresponding to
various fields included in the video conference metadata, and a
user may perform the predetermined input through the input windows
F1, F2, F3, F4, and F5 using a predetermined input method (for
example, by using a keyboard), so that the first electronic device
300 may receive the predetermined input. As shown in (a) of FIG.
22, the user enters "Jack" in the input window corresponding to the
"Name" field of the video conference metadata as a search
condition.
[0198] As another example, the first electronic device 300 may
receive the predetermined input using a touch input method.
Referring to (b) of FIG. 22, the user U2 touches "Jack" (e.g., the
fourth user U4) included in the screen image corresponding to the
third place. Such touch enables "Jack" to be entered as a search
condition.
[0199] As still another example, the first electronic device 300
may receive the predetermined input using voice recognition.
Referring to (c) of FIG. 22, the user U2 generates a voice command
by saying "search Jack!" Then, "Jack" is entered as a search
condition.
[0200] According to an embodiment, a combination of the
above-described methods may be used to receive the predetermined
input. For example, if the user touches the input window F1
corresponding to the "Name" field followed by saying "search Jack!"
with the screen image displayed as in (a) of FIG. 22, the first
electronic device 300 may receive the predetermined input.
[0201] Subsequently, the first electronic device 300 may obtain
information on a time point corresponding to the received input
(S310). The time information may be information for specifying a
time point determined on a time line that starts counting when the
video conference begins. For example, according to an embodiment, the "time
information" described in connection with FIGS. 21 to 25 may be the
same or substantially the same as the time information (e.g., the
first and second time points) described in connection with FIGS. 13
to 20. For example, the first electronic device 300 may determine a
search condition through the predetermined user input received in step S300
and may obtain the time information corresponding to the search
condition.
[0202] For example, in the case that the video conference metadata
is generated and stored as described in connection with FIGS. 13 to
20, the first electronic device 300 may receive a search condition
corresponding to "Jack" from a user through the predetermined user
input as described in connection with FIG. 22 and may extract
information on the time point mapped with information corresponding
to "Jack" from the video conference metadata. As such, the
information to be extracted is the time information.
[0203] Accordingly, the first electronic device 300 may output the
multimedia data corresponding to the obtained time point (S320).
For example, the first electronic device 300 may store the
multimedia data relating to the video conference and may call the
stored multimedia data and output the data through the sound output
unit 152 and/or the display unit 151 from the part corresponding to
the time information obtained in step S310.
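Combining the metadata with the stored multimedia data, steps S300 to S320 may be sketched as follows: match the search condition against any metadata field (S300), extract the mapped time points (S310), and start output from the corresponding part of the stored data (S320). The playback callable is a hypothetical stand-in for the output unit 150.

    def search_time_points(metadata, condition):
        # S310: e.g., the condition "Jack" matches the entry whose
        # "name" field is "Jack".
        return [entry["time"] for entry in metadata
                if condition in entry.values()]

    def review(metadata, condition, play_from):
        for t in search_time_points(metadata, condition):
            play_from(t)   # S320: output the multimedia data from time point t

With the FIG. 20 metadata sketched earlier, search_time_points(FIG20_METADATA, "Jack") returns [400], i.e., 6 minutes and 40 seconds into the conference.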
[0204] If steps S300 and S310 are performed while the video
conference is in progress on the first electronic device 300,
step S320 may be conducted by various methods as follows.
Hereinafter, for convenience of description, the multimedia data
for video conference now in progress is referred to as "current
multimedia data", and the multimedia data corresponding to the time
information obtained in steps S300 and S310 is referred to as "past
multimedia data".
[0205] FIGS. 23 to 25 are views illustrating examples of outputting
the current and past multimedia data according to an embodiment of
the present invention.
[0206] According to an embodiment, the first electronic device 300
may display both video data included in the current multimedia data
and the video data included in the past multimedia data on the
display unit 151 and may output only the audio data included in the
current multimedia data through the sound output unit 152 without
outputting the audio data included in the past multimedia data.
Referring to FIG. 23, the screen image S5 for the current
multimedia data is displayed on the third region R3 of the display
unit 151, and the screen image S6 of the past multimedia data is
displayed on the fourth region R4 of the display unit 151. However,
only the audio data included in the current multimedia data is
output, and the audio data included in the past multimedia data is
not. In the case that the current multimedia data includes a
plurality of multimedia data clips, the
screen image displayed on the third region R3 may correspond to a
multimedia data clip including the speaker currently speaking in
the conference among the plurality of multimedia data clips, while
the other multimedia data clips may be displayed on the second
region R2.
[0207] According to an embodiment, the first electronic device 300
may display the video data included in the current multimedia data
on a region of the display unit 151 and may output the audio data
included in the current multimedia data through the sound output
unit 152. The video data included in the past multimedia data is
not displayed, and text data converted from the audio data included
in the past multimedia data may be displayed on another region of
the display unit 151. Referring to FIG. 24, the video data included
in the current multimedia data is displayed on the third region R3
of the display unit 151, and the text data converted from the audio
data included in the past multimedia data is displayed on the
fourth region R4 of the display unit 151.
[0208] In the case that the current multimedia data includes a
plurality of multimedia data clips, the screen image displayed on
the third region R3 may correspond to a multimedia data clip
including the speaker currently speaking in the conference among
the plurality of multimedia data clips, while the other multimedia
data clips are displayed on the second region R2. Referring to FIG.
25, the video data included in the current multimedia data is
displayed on the second region R2 of the display unit 151, and text
data converted from the audio data included in the past multimedia
data is displayed on the fifth region R5 of the display unit 151.
According to an embodiment, the other multimedia data clips may be
also displayed on the second region R2, and the multimedia data
clip including the current speaker may be highlighted.
[0209] According to an embodiment, the first electronic device 300
may output the audio data included in the current multimedia data
through the sound output unit 152 while not outputting the audio
data included in the past multimedia data and may display the video
data included in the past multimedia data on the display unit 151
while not displaying the video data included in the current
multimedia data.
[0210] According to an embodiment, the first electronic device 300
may output the audio data included in the past multimedia data
through the sound output unit 152 while not outputting the audio
data included in the current multimedia data and may display the
video data included in the current multimedia data on the display
unit 151 while not displaying the video data included in the past
multimedia data.
[0211] Alternatively, the first electronic device 300 may output
the current and past multimedia data by various methods.
[0212] As such, after the video conference ends or while it is in
progress, specific time points of the multimedia data for the video
conference may be searched to review the video conference, and the
multimedia data corresponding to the searched time points may be
output.
[0213] In the methods of controlling an electronic device according
to the embodiments, each step is not necessarily required, and
according to an embodiment, the steps may be selectively included.
The steps need not be performed in the order described above, and
according to an embodiment, a later step may be performed earlier
than an earlier step.
[0214] The steps in the methods of controlling an electronic device
may be performed separately or in combination. According to
an embodiment, steps in a method may be performed in combination
with steps in another method.
[0215] The methods of controlling an electronic device may be
stored in a computer readable medium in the form of codes or a
program for performing the methods.
[0216] The invention has been explained above with reference to
exemplary embodiments. It will be evident to those skilled in the
art that various modifications may be made thereto without
departing from the broader spirit and scope of the invention.
Further, although the invention has been described in the context
of its implementation in particular environments and for particular
applications, those skilled in the art will recognize that the
present invention's usefulness is not limited thereto and that the
invention can be beneficially utilized in any number of
environments and implementations. The foregoing description and
drawings are, accordingly, to be regarded in an illustrative rather
than a restrictive sense.
* * * * *