What can we do with ComfyUI other than using it for running Stable Diffusion?
I have collected almost 30 different use cases, things like image to text,
how to create sound effects directly from an image, and some others that are just super useful functions, like effects, filters, or image enhancements.
Quickly, I will show all of the things I found myself using over the last month and how you can make your workflow
even more complex with all of these functions.
So you don't need to go to any other software other than ComfyUI.
If you don't know what LLaVA is, it's basically an image-to-text model that can understand what is happening in an image.
And you can ask questions about that image.
So we have our main LLaVA model here.
You need to download two models to be able to use it.
We can write our prompt in this part and connect our image.
Let's say we want to use this image, and we can say things like: describe the location of the image and what is happening. If we run it,
we will get something like: the image features a serene scene with two small wooden cabins,
which are on the outskirts of the forest.
They are positioned next to each other, overlooking a picturesque lake or pond.
It's pretty accurate, and we can also ask some other, more detailed questions, like maybe what the buildings are made out of.
Or we can try a different image as well; there is really no limit to what you can ask.
Let's use this image now and ask maybe what the style of the building in the image is.
The structure, with many windows and a unique design, appears to be a residential or commercial building, possibly a living space.
And the answer gets cut off,
because the maximum tokens are set to 40. We can increase it to 300 tokens and then it may create longer outputs.
Let's change it to 200; apparently 200 is the maximum.
So we get a way longer output, and with the temperature you can set how creative the model should be.
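If you want to play with the same idea outside the node graph, here is a minimal Python sketch that asks a locally served LLaVA model about an image via Ollama's REST API. This is an assumption on my side, not the exact node shown here, and the file path, model name, and server URL are placeholders.

```python
# Sketch: ask a locally served LLaVA model about an image via Ollama's REST API.
# Assumes Ollama is running on localhost:11434 with a "llava" model pulled.
import base64
import requests

with open("cabins.jpg", "rb") as f:  # hypothetical image path
    image_b64 = base64.b64encode(f.read()).decode("utf-8")

response = requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "llava",
        "prompt": "Describe the location of the image and what is happening.",
        "images": [image_b64],
        "stream": False,
        "options": {"num_predict": 200, "temperature": 0.7},  # max tokens / creativity
    },
)
print(response.json()["response"])
```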
So that's the first one. Let's go to our second workflow.
In our second module I have collected a couple of different ways you can use to remove the background of different objects
from any image you want to use.
In this case I will use this picture. In the first two workflows,
we don't have control over which object to keep and which object to remove.
It kind of understands on its own what the main element in the image is and then removes the background around that object.
In the second one, we have a couple of different models that we can choose from.
Some of them are for general purposes, as you can see, and this one, for example, is focused on human segmentation, which can be nice.
Let's keep this one for now.
And the third one is slightly different from the first two,
because in this one we can actually prompt what we want to keep in the image.
Let's say we want to keep this armchair,
and then we generate. In the first one,
it decided to keep the armchair, this part of the coffee table, and the frame in the background.
In the second one, just the armchair with this piece of wood here.
And in the third one, it tried to only keep the armchair, but it also removed this part.
We can try to fix this with the threshold value.
We can also try to prompt it in more detail, but as we can see,
quality-wise it is not as good as the first two. It is more flexible though, so it
is up to you which one makes sense for which use case. Let's try to keep the poster on the wall.
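If you ever need the same kind of background removal outside ComfyUI, here is a rough Python sketch using the rembg library. This is my own assumption as an equivalent, not the exact nodes in the workflow; the file name is a placeholder, and "u2net_human_seg" is just one of the selectable model names.

```python
# Sketch: background removal with the rembg library (same idea as the nodes above).
from rembg import remove, new_session
from PIL import Image

img = Image.open("armchair.jpg")           # hypothetical input
session = new_session("u2net_human_seg")   # or "u2net", "isnet-general-use", ...
cutout = remove(img, session=session)      # returns an RGBA image with transparency
cutout.save("armchair_cutout.png")
```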
One of the main reasons why I really like ComfyUI is because it's just an empty canvas or tool for us.
You can do almost everything.
But if you want to take notes or create mood boards, that's not really possible to do with ComfyUI.
Its brother Scrintal can help you with that, though.
Scrintal is a visual note-taking platform where you can place cards on your board and connect ideas together to document them better.
Inside each card we can add lists, images, PDFs, videos, and a few more options as well.
They have lots of templates for different kinds of use cases,
like research for a blog post, for example, or how we can take lecture notes, depending on the topic.
Here are my notes for this video on the board.
You see it's really similar to ComfyUI and super flexible in terms of what you can do with it.
I have placed all of the resources, materials, and custom extensions I used in the video.
Here on this Scrintal board, you can find the link in the video description.
Thank you Scrintal for sponsoring this section of the video.
If you want to try it out, you can use the Design 10 code for a discount.
Okay, let's continue with our third module, which is video to mask, how we can remove the background from a video.
So in the first node we have our load video component, where we can choose the video to load. In this case,
I have a video of this guy dancing, so we can separate him from the background. In this part,
we can choose the frame rate.
We can give it 24, or we can reduce it or make it higher.
And we can choose a limit of frames.
Let's say you don't want to wait for the whole thing while you are testing,
so we can set it to something like 30 frames.
So it will only generate the first 30 frames of the video.
And then we are removing the background for all of the frames.
In this case, instead of the model we used previously,
we use U2Net as the human segmentation model, and then we merge all of
the frames back to create our video. So let's run it. Of course, depending on the
length of the video this will take considerably longer, because we are doing the same process for each frame. In this case,
we're going to do it 30 times.
You can see our guy completely removed from the background, and also the alpha channel version of it,
in case you want to run it through ControlNet to create animation videos.
Since this is only the first 30 frames, it's not super long.
Let's try maybe 90 frames with 15 FPS, so we can do the whole video.
In this one, we can see all of the individual frames that were generated.
And we end up with a video like this one.
So you have total flexibility over the settings, the FPS, the frames, and how you want to segment it, basically.
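The same per-frame idea can be written out in plain Python with OpenCV plus rembg. This is only a sketch under my own assumptions, not the actual node implementation; the video path, model name, frame limit, and FPS mirror the node settings but are placeholders.

```python
# Sketch: read a limited number of frames, remove the background per frame,
# and write the result back into a video file.
import cv2
from rembg import remove, new_session

session = new_session("u2net_human_seg")
cap = cv2.VideoCapture("dancer.mp4")        # hypothetical input video
fps, frame_limit = 15, 90                   # e.g. 90 frames at 15 FPS
writer = None

for _ in range(frame_limit):
    ok, frame = cap.read()
    if not ok:
        break
    rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
    cutout = remove(rgb, session=session)               # RGBA result
    bgr = cv2.cvtColor(cutout, cv2.COLOR_RGBA2BGR)      # drop alpha for the mp4
    if writer is None:
        h, w = bgr.shape[:2]
        writer = cv2.VideoWriter("dancer_nobg.mp4",
                                 cv2.VideoWriter_fourcc(*"mp4v"), fps, (w, h))
    writer.write(bgr)

cap.release()
writer.release()
```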
Let's continue with our fourth module, which is the LLM part, our text generator part.
I wanted to show a couple of different workflows for running different types of LLM models.
The first one is running totally locally, on your computer, or if you're using a server, on your server.
In this case, we're using this LLM nodes extension.
We can easily install any models and then we can choose which model we want to use.
I have four of them installed right now.
And one is trained specially for Stable Diffusion prompts.
It is not super cool, but I can see how you might want to use it.
We have two prompt options, one of them is the system prompt, and the other one is a normal prompt we want to write.
In the system prompt you can specify what the purpose is,
what type of thing you are trying to do, and then here we can write things like
create a prompt for a building in the desert covered with sand.
Here again, similar to LLaVA, we have a couple of options to set the maximum tokens, the temperature, and a couple of other settings.
I mean, there are a bunch of different things happening here, but I think it's a pretty decent output for a model running locally.
Let's try another one, maybe Mistral. We get a prompt, and it actually wrote an image generation prompt.
We can totally use it.
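For reference, here is roughly what this local prompt-generator idea looks like as a Python sketch against a local Ollama server. This is an assumption, not the exact extension; the model name, prompts, and settings are placeholders.

```python
# Sketch: system prompt + user prompt against a local LLM served by Ollama.
import requests

payload = {
    "model": "mistral",
    "system": "You are a Stable Diffusion prompt writer. Answer with a single, detailed prompt.",
    "prompt": "Create a prompt for a building in the desert covered with sand.",
    "stream": False,
    "options": {"num_predict": 200, "temperature": 0.8},  # max tokens / creativity
}
r = requests.post("http://localhost:11434/api/generate", json=payload)
print(r.json()["response"])
```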
So let's go to our second option.
This node is called generate Stable Diffusion prompt with LLM.
This one is a bit different because we are actually using another service to run our LLM.
So it's not running locally,
but this one is using an online platform. In the parameters,
in the last one we can choose which model we want to use,
and actually there are lots of them that we can try,
like the one we used previously,
and they even have GPT-3.5 Turbo,
Gemini Pro, Claude, etc. For some of them it says you can run these, and I think for the rest
you have some kind of limit set on your account.
So far, I'm able to use it without any issues.
So you can try to check it.
All you need to do on this platform is copy your API key, and in the config file you need to paste your key.
After that, it should work. Again, we have our system prompt here,
and at the top, we can type our prompt.
So I will just use the same ones we used here.
And let's try, for example, one of them, Mistral, and run it to get our prompt.
What we get here is from the same model we used before.
And let's try, for example, GPT-3.5 Turbo.
We get our response back; the art style is realism.
I think because of this it's creating these categories here, so maybe let's remove it and send it again. Now with the
updated prompt we get a detailed, nice prompt. You may think, why should I use GPT here instead
of the chat version directly? I see all of these extensions as components for a complete
workflow. So if you have all of these components, like LLaVA, which you can use to get information from
your images, or LLMs, or background removers and other things, ComfyUI
becomes a tool where you can connect them together and create something way more complex.
And in the last one, we have a node from MixLab, the ChatGPT node.
Here we can directly place the API key that you can get from OpenAI,
and then basically use GPT-3.5 Turbo, GPT-4, the newest versions of the models, directly.
You can get your API key from the OpenAI accounts page and then place it here, then
write your prompt here and send it.
It should work without any issues.
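Behind the scenes, a node like this is roughly doing what the official openai Python package does. Here is a minimal sketch of that call; the key, model, and prompts are placeholders, not the node's actual internals.

```python
# Sketch: calling GPT directly with the openai Python package.
from openai import OpenAI

client = OpenAI(api_key="sk-...")  # paste your own key from the OpenAI account page
resp = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[
        {"role": "system", "content": "You write concise Stable Diffusion prompts."},
        {"role": "user", "content": "A building in the desert covered with sand."},
    ],
    max_tokens=200,
)
print(resp.choices[0].message.content)
```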
So our LLM part has these three options: running locally, on a third-party server, and directly from OpenAI.
Let's go one step further, to the upscale and enhance workflow.
The one on the top is image contrast adaptive sharpening.
It's a pretty simple and nice component, I guess; it just removes the blur on top of the image and adds more texture, but not so much that it becomes obvious the image has been sharpened. So it's a nice one. And on the bottom we have our upscale image using model node.
There are lots of models that you can find and use,
like for example Real-ESRGAN, one of the most popular ones, or this foolhardy one, or UltraSharp.
Let's use the UltraSharp model in this one and run it. I added these image compare
nodes so we can see the difference between the before and after versions. So
first let's check the sharpening one. If you zoom in a bit we can see the details
a bit more clearly; for example, here on the right side you can see the before and after. Of course there is nothing super dramatic, but
I think sometimes it's worth doing this.
We can see the blur is disappearing without changing the image at all.
And in the bottom one the change should be a bit more dramatic, because we are upscaling the image four times.
And we can see from right to left the upscaled and the normal version;
the one on the left is upscaled and the right is the normal one.
And this upscaling is not a super crazy, good one;
it's just a simple, quick one to use maybe in between the workflow,
between different steps. It's not so great to use as a final upscaler, because it is not adding any more details to the image.
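Just to show the before/after idea in plain Python, here is a small sketch with a classic unsharp mask and a quick 4x resize in Pillow. To be clear, this is my own simplified stand-in, not the contrast adaptive sharpening node or an ESRGAN model, and the file names are placeholders.

```python
# Sketch: unsharp-mask sharpening plus a quick 4x resize with Pillow.
from PIL import Image, ImageFilter

img = Image.open("render.png")  # hypothetical input
sharpened = img.filter(ImageFilter.UnsharpMask(radius=2, percent=120, threshold=3))
upscaled = sharpened.resize((img.width * 4, img.height * 4), Image.LANCZOS)
upscaled.save("render_sharp_4x.png")
```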
In the next module we have some image filters that we can use to add some different touches to our images.
The first one is a channel shake, which creates this type of effect by moving the RGB channels of the image.
We set the distance and the angle we want to shake the channels by,
and we get an image like this.
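To give an idea of what is happening under the hood, here is a rough NumPy sketch of a channel shake: the color channels are simply offset against each other. The distance and angle are reduced to a plain horizontal pixel offset here, which is my own simplification, and the file name is a placeholder.

```python
# Sketch: a rough "channel shake" by offsetting the R and B channels with NumPy.
import numpy as np
from PIL import Image

img = np.array(Image.open("portrait.png").convert("RGB"))
shaken = img.copy()
dx = 6  # shake distance in pixels (assumption)
shaken[:, :, 0] = np.roll(img[:, :, 0], dx, axis=1)   # shift red to the right
shaken[:, :, 2] = np.roll(img[:, :, 2], -dx, axis=1)  # shift blue to the left
Image.fromarray(shaken).save("portrait_channel_shake.png")
```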
You can think of all of these modules as more similar to Photoshop filter functions than to super crazy image generation ones.
But I think they are good to have in an open-source platform like this.
So we can do many different things.
I can imagine ComfyUI becoming a tool like Grasshopper, with lots of different possibilities,
but of course focused more on AI tools
for non-coders. In the next one we have a filter for watercolor,
a motion blur, and a depth blur. This last one is a bit more interesting because it creates a depth map and then applies a depth
blur, and we can choose where to apply the blur, basically how close or how far away. This
effect is better to show with something like this image, I think, but let's check all the other
ones first. The watercolor gives this type of effect.
Of course we are not actually generating a version of the same image in a watercolor style.
It's more about the filters and the edge lines around the objects.
And this one is a nice motion blur with a horizontal direction.
Now let's try the depth blur one on this image.
I want to focus on the far part,
so I will set this one to zero.
First, we create a depth map of our image and then use it to apply the blur.
As you can see, these parts which are closer to us get blurred, and we focus on the further wall, like this.
And of course, you can change it the other way around,
so the far part becomes blurry and we focus on the closer part.
I think this is also a good filter to use, especially for portrait images.
So maybe let's say we have something like this image and we want to blur the background.
Maybe this is too much blur;
we can also control how strong the blur should be, so let's reduce it a bit and run it again. Smoother.
Or the other way around: we blur the girl and the background stays in focus.
Let's change the image, and I will turn off these parts.
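For anyone curious how the depth blur idea works in code, here is a hedged sketch: estimate a depth map with a generic monocular depth model, then blend a blurred copy of the image using that map as a mask. The depth model, normalization, and blend direction are my assumptions, not the exact ComfyUI nodes, and the file name is a placeholder.

```python
# Sketch: depth-of-field blur driven by a monocular depth estimate.
import numpy as np
from PIL import Image, ImageFilter
from transformers import pipeline

img = Image.open("kitchen.png").convert("RGB")
depth = pipeline("depth-estimation")(img)["depth"]   # PIL image, typically near = bright
d = np.array(depth.resize(img.size), dtype=np.float32)
d = (d - d.min()) / (d.max() - d.min())              # normalize to 0..1

blurred = img.filter(ImageFilter.GaussianBlur(radius=8))
mask = Image.fromarray((d * 255).astype(np.uint8))   # bright (near) areas get blurred
result = Image.composite(blurred, img, mask)         # mask=255 -> take blurred pixel
result.save("kitchen_depth_blur.png")
```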
The last one is a bit different because in this one we are adapting the colors of the original image we want to use,
in this case this one, to match a reference image. Let's run it to see what type of effect we get with
these colors. So it changed to something like this.
Of course it is up to you whether you like it or not, but I think it's a really quick way to change the image.
Let's try this other one and see how it will affect the image.
I can see the effect, but it's a bit too much, so let's reduce it and run it again.
So we get something like a color correction on the image.
So these were the more dramatic filters, like the channel shake, the watercolor, the motion blur, the depth blur, or the color adaptation.
In the next one we have simpler enhancement features.
The first one is the filter adjustments, where you can change the brightness, the saturation, and the sharpness.
You can also add some edge or detail enhancement, which is basically something similar to sharpening.
The second one applies some film grain on top of the image.
And the third one is if you have some LUTs that you can use to adjust the colors and the style of the image.
You can upload a bunch of them to your ComfyUI folder and then choose them there to apply some effects.
I found these ones free online, and I will also attach them to the video description so you can get the same ones.
Let's run it to see all of their effects.
Usually what I do is add the adjustment filters at the end of my workflow to fix the final
image once I like it. Or if you want to keep maybe the same color style across the different
images you generate, you can apply the same LUT to all of them to have a similar look.
The first one is not such a dramatic change.
As I said, it's like adjusting the contrast and increasing the saturation.
The second one is more about the film grain on top of it.
In this part, it's a bit more visible.
And the third one is the LUT;
this is a bigger change compared to the first two.
This one is for vibrant colors; maybe let's change it to a warm filter and run it.
Then we end up with an image like this one.
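The first two enhancement steps are easy to mirror in plain Pillow/NumPy, so here is a small sketch of the brightness/contrast/saturation adjustments plus film grain. LUT loading is left out, the strengths are placeholders, and this is only my simplified stand-in for the nodes.

```python
# Sketch: simple contrast/saturation/sharpness adjustments plus film grain.
import numpy as np
from PIL import Image, ImageEnhance

img = Image.open("final_render.png").convert("RGB")
img = ImageEnhance.Contrast(img).enhance(1.1)     # +10% contrast
img = ImageEnhance.Color(img).enhance(1.15)       # +15% saturation
img = ImageEnhance.Sharpness(img).enhance(1.2)    # mild sharpening

arr = np.array(img).astype(np.float32)
grain = np.random.normal(0, 8, arr.shape)          # film grain strength (assumption)
out = np.clip(arr + grain, 0, 255).astype(np.uint8)
Image.fromarray(out).save("final_render_graded.png")
```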
Let's continue now to a more exciting workflow. Here we have text to audio and image to audio, where we combine a couple of different models we covered so far.
Let's do the first one. Basically, in this one
we have the AudioLDM model, a generative AI model for creating audio from text.
You can give it a prompt here and it will generate the audio. Right now
we set it to 10 seconds, and we have similar settings like the CFG guidance scale, the seed number, and the extension type.
We have a prompt, city life, here.
Maybe let's change it to a fish shop on a busy Sunday, people are talking.
Let's run it, and then we get a sound effect like this.
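If you want to try the same text-to-audio step in a script, here is a hedged sketch using the AudioLDM pipeline from the diffusers library. The model id, step count, guidance scale, and length roughly mirror the node settings but are assumptions, not the node's actual configuration.

```python
# Sketch: text-to-audio with diffusers' AudioLDM pipeline (16 kHz output).
import torch
import soundfile as sf
from diffusers import AudioLDMPipeline

pipe = AudioLDMPipeline.from_pretrained(
    "cvssp/audioldm-s-full-v2", torch_dtype=torch.float16
).to("cuda")

prompt = "A fish shop on a busy Sunday, people are talking."
audio = pipe(prompt, num_inference_steps=50,
             audio_length_in_s=10.0, guidance_scale=2.5).audios[0]
sf.write("fish_shop.wav", audio, samplerate=16000)
```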
This is super cool on its own, but it becomes even more exciting for me when we combine, like, three models.
In this one we have our LLaVA image-to-text model to describe our image.
The idea here is image to audio, so we can maybe do
image to video first and then at the same time create image to audio, and
then combine them to create a nice video with sound effects on its own. So we have a
render like this, and I'm passing it to LLaVA to describe the image, its location and
environment. Then we are using the answer we get from LLaVA to feed it as a prompt to our local LLM,
and we are using as the system prompt: you are a sound effect producer AI, suggest the possible sounds for this space,
suggest a one-sentence-long prompt for the sound effects.
And we are using the answer we get from our chatbot as the prompt for our audio generative model.
So let's see what we end up with.
First, LLaVA says the scene is set in a park with people walking around and enjoying the outdoors.
There are several individuals in the area, some of them closer, some of them further away.
And then our Mistral LLM said: create a layered soundscape of ambient chatter,
natural sounds, and casual footsteps on various surfaces such as gravel, pavement, and grass. And let's hear it.
I think it was pretty decent, and I can totally see these sounds coming from this image.
So let's try another image.
Let's go for this interior kitchen view.
I wonder what type of sound effect our chatbot will suggest for this space.
The sound of footsteps approaching and stopping in the large modern kitchen, followed by the sound of chairs being pulled out and placed back under the kitchen island.
I'm pretty happy with it, to be honest.
I wasn't expecting a prompt like this for the sound effect, but let's see how well it will follow it in our sound effect.
It's not bad, but also not super dramatic, because there is not much happening.
We just hear footsteps in an interior room.
But I think you get the idea and you see the potential of it, how we can use it.
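To make the chaining idea concrete, here is a short sketch that strings the pieces together: LLaVA describes the image, a local LLM turns that description into a one-sentence sound prompt, and that string could then be fed into the AudioLDM sketch above. It reuses the same hypothetical Ollama server as before; models, prompts, and file names are placeholders, not the exact node setup.

```python
# Sketch: image -> description -> sound-effect prompt, chained via a local Ollama server.
import base64
import requests

OLLAMA = "http://localhost:11434/api/generate"

with open("park_render.jpg", "rb") as f:
    img_b64 = base64.b64encode(f.read()).decode("utf-8")

scene = requests.post(OLLAMA, json={
    "model": "llava",
    "prompt": "Describe this image, its location and environment.",
    "images": [img_b64],
    "stream": False,
}).json()["response"]

sound_prompt = requests.post(OLLAMA, json={
    "model": "mistral",
    "system": ("You are a sound effect producer AI. Suggest a one-sentence prompt "
               "for the sound effects of the described space."),
    "prompt": scene,
    "stream": False,
}).json()["response"]

print(sound_prompt)  # pass this string to the text-to-audio step
```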
So let's continue to our next one.
This one is similar to the layer effects inside Photoshop, where we can create a drop shadow on a specific object.
In this image we are taking the sofa, removing the background, and then placing it back with some drop shadow.
Of course it is up to you after this point how you use it. Or we can add a stroke around our object,
apply an outer glow, or use the image opacity.
This is the image opacity reducer, nothing super fancy, but sometimes, to blend different elements into each other, it might be a nice option.
I'm using this color as a background color for now, so we can see it. The outer glow is like
this; we can change all the colors, the light color, the glow color, and the brightness.
In this one we add a stroke around our object with a specific color, in this case red, and here is our drop shadow.
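Here is a small Pillow sketch of the drop shadow idea: take the cutout's alpha channel, darken and blur it, offset it, and composite the object on top. The offsets, blur radius, opacity, and file names are assumptions, not the layer-effect node's actual values.

```python
# Sketch: drop shadow from an RGBA cutout with Pillow.
from PIL import Image, ImageFilter

cutout = Image.open("sofa_cutout.png").convert("RGBA")    # transparent background
pad = 40                                                   # room for the shadow offset
canvas = Image.new("RGBA", (cutout.width + pad, cutout.height + pad), (240, 240, 240, 255))

# Shadow = the cutout's alpha, darkened, blurred, and offset.
alpha = cutout.split()[3].point(lambda a: int(a * 0.6))    # 60% shadow opacity
shadow = Image.new("RGBA", cutout.size, (0, 0, 0, 0))
shadow.paste((0, 0, 0, 255), mask=alpha)
shadow = shadow.filter(ImageFilter.GaussianBlur(radius=10))

canvas.alpha_composite(shadow, dest=(35, 35))              # offset shadow
canvas.alpha_composite(cutout, dest=(20, 20))              # object on top
canvas.convert("RGB").save("sofa_drop_shadow.png")
```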
This one is more like a color palette generator from an image, so maybe we can use it to create some type of mood board.
In the first one we are getting the main color from the image.
In the second one we get the average color. And in the third one, we are getting the color palette, but as a pixelated version of it.
So we can see the blues here, representing the building, and the greens are here,
according to their location in the image.
And in the last one, we are saying we want to get nine colors as a color palette from this image.
And this is the color palette used in this image.
Let's test it with another image,
and we can see what the most dominant color is,
what the average of them is,
and then the color palette, one for the placement within the image and one general color palette, which is pretty accurate, I think.
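If you want the same palette, average, and dominant colors in a script, here is a sketch with Pillow and NumPy. It uses adaptive quantization rather than whatever method the node uses, and the file name and coarse binning for the dominant color are my own assumptions.

```python
# Sketch: extract a nine-color palette, the average color, and a dominant color.
import numpy as np
from PIL import Image

img = Image.open("building.jpg").convert("RGB")

# Nine-color palette via adaptive quantization
quantized = img.quantize(colors=9, method=Image.Quantize.MEDIANCUT)
palette = quantized.getpalette()[: 9 * 3]
colors = [tuple(palette[i:i + 3]) for i in range(0, 9 * 3, 3)]
print("palette:", colors)

# Average and (coarsely binned) dominant color, like the first two nodes
arr = np.array(img).reshape(-1, 3)
print("average:", tuple(arr.mean(axis=0).astype(int)))
vals, counts = np.unique(arr // 16 * 16, axis=0, return_counts=True)
print("dominant:", tuple(vals[counts.argmax()]))
```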
Now we have our image to 3D generator.
This one is using the Stable Zero123 model.
And then we are basically creating six images from different angles of this space.
You can also use a 3D viewer component here and then be able to see it in 3D.
This is more on the experimental side, because especially with a picture like this,
like an architectural design, I don't think it's super useful yet.
But we are definitely getting there, especially after the new video models.
We get a grid of images like this, and if we check all of them one by one,
this is like a top view of the space.
I think it's pretty accurate, but of course nothing is happening in the background, because we don't have any information about that part; the part we see, though, is pretty accurate.
This is from behind, I guess.
Also, this is from the other side;
again, it's pretty cool that it applied the same pattern of the facade also in this part, from behind and on the side.
I can totally see some use cases for object design or character design, but in this case it's more on the experimental side.
This is a simpler workflow, just to remove some objects directly with a given image and a given mask as the location in the image.
This one is not like a common Stable Diffusion inpainting workflow;
we are not writing any prompt or any other settings.
So if we right-click and open it in the mask editor, let's try to remove one of the windows, for example.
Let's say we want to remove this one and this one.
Let's say there is a weird line here.
Let's remove that one as well and save it.
It will remove these windows for us and try to blend it with the rest of the image,
which in this case is pretty cool, except that you can see some mistakes.
So let's try to remove one of the chairs.
It did a pretty good job in this case.
I'm a bit surprised to be honest.
Let's copy this one, place it here, so we can try to remove something else, test it.
Let's remove this light in the middle.
So maybe let's remove this plant here, to give it a more challenging task,
and see how it will go. Again, it's still good. In a couple of steps, you can change this image to a version like this.
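The "image plus mask in, cleaned image out" idea can also be illustrated with classical OpenCV inpainting. To be clear, this is not the learned model the workflow uses, just a minimal sketch of the same interface; the file names are placeholders.

```python
# Sketch: mask-based object removal with classical OpenCV inpainting.
import cv2

img = cv2.imread("interior.png")
mask = cv2.imread("mask.png", cv2.IMREAD_GRAYSCALE)   # white where the object to remove is
mask = cv2.threshold(mask, 127, 255, cv2.THRESH_BINARY)[1]

result = cv2.inpaint(img, mask, 5, cv2.INPAINT_TELEA)  # radius 5, Telea method
cv2.imwrite("interior_removed.png", result)
```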
The last workflow I want to show is how we can write text directly on top of an image inside
ComfyUI and then combine images to create a grid view.
Maybe you want to do like some comparison view or show a couple of images together.
Let's say we have two versions of the same design.
One with a black concrete facade like this one, and the other one with a timber facade, and we want to write the material as a label on top of the image.
Let's write timber for this one and put them on a grid.
So we are using two different text creators.
Both of them are really similar;
the main difference between them is that in one of them you can set up margins and line
spacing, etc., and in the other one you can directly add text.
In both of them we can choose different fonts, or we can also just upload our own fonts that we want to use.
Let me turn off this part first.
You can see where it's written, as text on its own and on top of the image. So this is
how we can type text directly on top of the image. I prefer using this draw
text component, which is a bit easier to use compared to this
one. Later you can decide where this text should go on top of the image with these values.
Right now we are saying 10 pixels to the right and 15 pixels down, so if we change this one, the text should move more towards the middle, like this, and the same goes for the Y.
And once you are done with both of them, let's combine them using first image batch
and then create image grid, where we can set the border color, border
thickness, and how many columns we want. And then once we combine all of them,
we should get a grid with our images, just in two columns, like this one.
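For completeness, here is a short Pillow sketch of the same idea: draw a label onto each image, then paste them into a two-column grid with a border. The font, offsets, border size, and file names are all assumptions, not the node defaults.

```python
# Sketch: label two images and tile them into a two-column grid with a border.
from PIL import Image, ImageDraw, ImageFont

def label(path, text):
    img = Image.open(path).convert("RGB")
    draw = ImageDraw.Draw(img)
    font = ImageFont.load_default()           # or ImageFont.truetype("YourFont.ttf", 48)
    draw.text((10, 15), text, fill="white", font=font)  # 10 px right, 15 px down
    return img

imgs = [label("concrete.png", "black concrete"), label("timber.png", "timber")]
w, h, border = imgs[0].width, imgs[0].height, 8
grid = Image.new("RGB", (2 * w + 3 * border, h + 2 * border), "black")  # border color
for col, im in enumerate(imgs):
    grid.paste(im, (border + col * (w + border), border))
grid.save("comparison_grid.png")
```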
So these are all of the workflows I wanted to share with you.
This is a slightly different video, trying to show what the things are that you can do inside ComfyUI.
It is not just a Stable Diffusion user interface, but much more than that.
I can totally see this becoming a tool on its own,
not just for Stable Diffusion, but with all the new open-source developments around AI and similar tools.
If you are a non-coder, this is an amazing tool to use almost the full potential of these new technologies and the new models.
So I definitely suggest you try it.
So far I have prepared two different videos: how we can
install it locally on your computer with a single-click installation, and also the
same for how we can use it on the cloud if you don't have a powerful computer.
These are the two videos, so feel free to check them if you want to learn how to
install it. Otherwise, thanks for watching up to this point. I hope you
liked the video. Let me know which one of the workflows was your favorite and which
one you think you will use. I will include all of the workflows and the
necessary extensions, tools, and models in the video description. You can find the direct template on my Patreon page,
so thank you for your support, and see you in the next video.