import json
import re
from typing import Dict, List, Optional

import requests

from web_scraper import get_cover_image


class BlogGenerator:
    """Generates blog posts through a sequence of OpenRouter chat-completion calls."""

    def __init__(self, openai_key: str, openrouter_key: str, serpapi_key: Optional[str] = None):
        self.openai_key = openai_key
        self.openrouter_key = openrouter_key
        # serpapi_key is accepted but not stored, since image search is now
        # handled by web_scraper.get_cover_image.

    def get_cover_image(self, title: str) -> Optional[str]:
        """Return a cover image URL for the blog post, or None if none is found."""
        try:
            # Inside this method, get_cover_image resolves to the module-level
            # function imported from web_scraper, not to this method.
            image_url = get_cover_image(title)
            if not image_url:
                print("No image found, trying with modified query...")
                # Retry once with a more generic query if the bare title fails
                image_url = get_cover_image(title + " high quality cover image")
            return image_url
        except Exception as e:
            print(f"Error in get_cover_image: {e}")
            return None

    def create_detailed_plan(self, cluster_data: Dict, preliminary_plan: str, research: str) -> str:
        """Expand the preliminary plan into a detailed, keyword-annotated plan; raises on API errors."""
        try:
            response = requests.post(
                'https://openrouter.ai/api/v1/chat/completions',
                headers={
                    'Authorization': f'Bearer {self.openrouter_key}',
                    'HTTP-Referer': 'http://localhost:5001',
                    'X-Title': 'Blog Generator'
                },
                json={
                    'model': 'google/gemini-2.0-flash-thinking-exp:free',
                    'messages': [{
                        'role': 'user',
                        'content': f"""You are part of a team that creates world-class blog posts.

For each new blog post project, you are provided with a list of keywords, a primary keyword, search intent, research findings, and a preliminary blog post plan. Here's a definition of each of the inputs: 

- Keywords: These are the keywords which the blog post is meant to rank for on SEO. They should be scattered throughout the blog post intelligently to help with SEO. 

- Search intent: The search intent describes what the user wants when searching for the keyword. Our goal is to optimise the blog post to be highly relevant and valuable to the user, so the blog post must satisfy the search intent.

- Research findings: This is research found from very reputable resources in relation to the blog post. You must intelligently use this research to make your blog post more reputable. 

- Preliminary plan: A very basic plan set out by your colleague to kick off the blog post. 

- Primary keyword: Out of the keywords, there is one keyword known as the primary keyword. The primary keyword is the keyword which has the highest SEO importance and as such must go in the title and first few sentences of the blog post. It is important that the blog post is highly relevant to the primary keyword, so that it could be placed naturally into the title and introduction sections. 

Given the above info, you must create a detailed plan for the blog post. 

Your output must: 

- Include a plan for the blog post.
- Be in dot point format.
- In each part of the blog post, you must mention which keywords should be placed. 
- All keywords must be placed inside the blog post. For each section, mention which keywords to include. The keyword placement must feel natural and must make sense. 
- You must include all research points in the blog post. When including the research points, make sure to also include their source URL so that the copywriter can use them as hyperlinks. 
- Your plan must satisfy the search intent and revolve directly around the given keywords. 
- Your plan must be very detailed. 
- Keep in mind the copywriter that will use your plan to write the blog post is not an expert in the topic of the blog post. So you should give them all the detail required so they can just turn it into nicely formatted paragraphs. For example, instead of saying "define X", you must have "define X as ...". 
- The plan must have a flow that makes sense. 
- Ensure the blog post will be highly detailed and satisfies the most important concepts regarding the topic. 

A new project has just come across your desk with the following details:

Keywords: {cluster_data['Keywords']}
Primary keyword: {cluster_data['Primary Keyword']}
Search intent: {cluster_data['Intent']}
Preliminary plan: {preliminary_plan}
Research findings: {research}

Create the detailed plan."""
                    }]
                },
                timeout=60
            )
            
            if response.status_code != 200:
                raise Exception(f"OpenRouter API error: {response.text}")
            
            response_data = response.json()
            if 'choices' not in response_data:
                raise Exception(f"Unexpected API response format: {response_data}")
            
            return response_data['choices'][0]['message']['content']
        except Exception as e:
            print(f"Error in create_detailed_plan: {e}")
            raise

    def write_blog_post(self, detailed_plan: str, cluster_data: Dict) -> str:
        """Write the full blog post from the detailed plan; raises on API errors."""
        try:
            response = requests.post(
                'https://openrouter.ai/api/v1/chat/completions',
                headers={
                    'Authorization': f'Bearer {self.openrouter_key}',
                    'HTTP-Referer': 'http://localhost:5001',
                    'X-Title': 'Blog Generator'
                },
                json={
                    'model': 'google/gemini-2.0-flash-thinking-exp:free',
                    'messages': [{
                        'role': 'user',
                        'content': f"""You are part of a team that creates world-class blog posts.

You are the team's best copywriter and are responsible for writing out the actual blog post.

For each new blog post project you are provided with a detailed plan and research findings.

Your job is to create the blog post by closely following the detailed plan.

The blog post you create must:
- Follow the plan bit by bit
- Use short paragraphs
- Use bullet points and subheadings with keywords where appropriate
- Not have any fluff. The content must be value dense and direct
- Be very detailed
- Include the keywords mentioned in each section within that section
- Place the primary keyword in the blog title, H1 header and early in the introduction
- Place one keyword for each section in the heading of that section
- When possible pepper synonyms of the keywords throughout each section
- When possible use Latent Semantic Indexing (LSI) keywords and related terms
- Be 2000 to 2500 words long
- Be suitable for a year 5 reading level

Make sure to create the entire blog post draft in your first output. Don't stop or cut it short.

Here are the details for your next blog post:

Keywords: {cluster_data['Keywords']}
Primary keyword: {cluster_data['Primary Keyword']}
Search intent: {cluster_data['Intent']}
Detailed plan: {detailed_plan}

Write the blog post."""
                    }]
                },
                timeout=120
            )
            
            if response.status_code != 200:
                raise Exception(f"OpenRouter API error: {response.text}")
            
            response_data = response.json()
            if 'choices' not in response_data:
                raise Exception(f"Unexpected API response format: {response_data}")
            
            return response_data['choices'][0]['message']['content']
        except Exception as e:
            print(f"Error in write_blog_post: {e}")
            raise

    def add_internal_links(self, blog_content: str, previous_posts: List[Dict]) -> str:
        """Insert internal-link URLs into the post; returns the original content on any error."""
        try:
            response = requests.post(
                # Routed through OpenRouter; an earlier version called the OpenAI
                # chat completions endpoint directly with model 'gpt-4'.
                'https://openrouter.ai/api/v1/chat/completions',
                headers={
                    'Authorization': f'Bearer {self.openrouter_key}',
                    'HTTP-Referer': 'http://localhost:5001',
                    'X-Title': 'Blog Generator'
                },
                json={
                    'model': 'google/gemini-2.0-flash-thinking-exp:free',
                    'messages': [{
                        'role': 'user',
                        'content': f"""You are part of a team that creates world-class blog posts.

You are in charge of internal linking between blog posts.

For each new blog post that comes across your desk, your job is to look through previously posted blogs and make at least 5 internal links.

To choose the best internal linking opportunities you must:
- Read the previous blog post summaries and look through their keywords. If there is a match where the previous blog post is highly relevant, then this is an internal linking opportunity.
- Do not link if it is not highly relevant. Only make a link if it makes sense and adds value for the reader.

Once you've found the best linking opportunities, you must update the blog post with the internal links. To do this you must:
- Add the link of the previous blog post at the relevant section of the new blog post. Drop the URL at the place which makes most sense. Later we will hyperlink the URL to the word in the blog post which it is placed next to. So your placing is very important.

Make sure to:
- Not delete any existing URLs or change anything about the blog post
- Only add new internal linking URLs
- Place URLs next to relevant anchor text
- Add at least 5 internal links if possible
- Only link when truly relevant and valuable
- Preserve all original content and formatting

Current blog post:
{blog_content}

Previous blog posts:
{json.dumps(previous_posts, indent=2)}

Your output must be the complete blog post with new internal links added. Don't summarize or modify the content - just add the URLs in appropriate places."""
                    }]
                },
                timeout=120
            )
            
            if response.status_code != 200:
                raise Exception(f"OpenRouter API error: {response.text}")
            
            response_data = response.json()
            if 'choices' not in response_data:
                raise Exception(f"Unexpected API response format: {response_data}")
            
            return response_data['choices'][0]['message']['content']
        except Exception as e:
            print(f"Error in add_internal_links: {e}")
            # If there's an error, return the original content
            return blog_content

    def convert_to_html(self, blog_content: str, image_url: str = None) -> str:
        """Render the post as styled HTML; returns the original content on any error."""
        try:
            # Replace newlines with <br> tags before sending the content to the model.
            # The featured image, if any, is added to the prompt inline below.
            formatted_content = blog_content.replace('\n', '<br>')
            
            response = requests.post(
                # Routed through OpenRouter; an earlier version targeted the OpenAI
                # chat completions endpoint.
                'https://openrouter.ai/api/v1/chat/completions',
                headers={
                    'Authorization': f'Bearer {self.openrouter_key}',
                    'HTTP-Referer': 'http://localhost:5001',
                    'X-Title': 'Blog Generator'
                },
                json={
                    'model': 'google/gemini-2.0-flash-thinking-exp:free',
                    'messages': [{
                        'role': 'user',
                        'content': f"""DO NOT OUTPUT ANYTHING OTHER THAN THE HTML CODE. Follow this layout template to generate WordPress code for a blog post:

The blog post should have:
- Title
{f"- Featured image: <img src='{image_url}' alt='Featured image' style='width: 100%; height: auto; margin: 20px 0;'>" if image_url else ""}
- Estimated reading time 
- Key takeaways
- Table of contents
- Body
- FAQ

Rules:
- Make it engaging using italics, dot points, quotes, bold, spaces, and new lines. No emojis.
- Hyperlink any referenced URLs to their adjacent keyphrases
- Wrap content in container <div> with inline CSS for black text (#000000), Arial/sans-serif font, 1.6 line height
- Set non-heading text to 20px and black (#000000) with !important
- Style links, TOC points, and FAQ questions in blue (#00c2ff)
- Add blue (#00c2ff) underline border to headings with padding
- Add double breaks (<br>) between sections
- Output only the HTML code, no extra text

Blog post content:
{formatted_content}"""
                    }]
                },
                timeout=120
            )
            
            if response.status_code != 200:
                raise Exception(f"OpenRouter API error: {response.text}")
            
            response_data = response.json()
            if 'choices' not in response_data:
                raise Exception(f"Unexpected API response format: {response_data}")
            html_content = response_data['choices'][0]['message']['content']
            # Collapse the returned HTML onto a single line
            return html_content.replace('\n', '')
        except Exception as e:
            print(f"Error in convert_to_html: {e}")
            # If there's an error, return the original content
            return blog_content

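    def generate_metadata(self, blog_content: str, primary_keyword: str, cluster_data: Dict) -> Dict:
        """Generate slug, title, and meta description with three model calls; raises on API errors or empty results."""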
        try:
            # Generate slug
            slug_response = requests.post(
                # Routed through OpenRouter; an earlier version called the OpenAI
                # chat completions endpoint directly with model 'gpt-4'.
                'https://openrouter.ai/api/v1/chat/completions',
                headers={
                    'Authorization': f'Bearer {self.openrouter_key}',
                    'HTTP-Referer': 'http://localhost:5001',
                    'X-Title': 'Blog Generator'
                },
                json={
                    'model': 'google/gemini-2.0-flash-thinking-exp:free',
                    'messages': [{
                        'role': 'user',
                        'content': f"""Create a slug for the following blog post:

{blog_content}

A slug in a blog post is the part of the URL that comes after the domain name and identifies a specific page. It is typically a short, descriptive phrase that summarizes the content of the post, making it easier for users and search engines to understand what the page is about. For example, in the URL www.example.com/intelligent-agents, the slug is intelligent-agents. A good slug is concise, contains relevant keywords, and avoids unnecessary words to improve readability and SEO.

The slug must be 4 or 5 words max and must include the primary keyword of the blog post which is {primary_keyword}.

Your output must be the slug and nothing else so that I can copy and paste your output and put it at the end of my blog post URL to post it right away."""
                    }]
                },
                timeout=60
            )
            
            if slug_response.status_code != 200:
                raise Exception(f"OpenRouter API error: {slug_response.text}")
            
            slug_data = slug_response.json()
            if 'choices' not in slug_data:
                raise Exception(f"Unexpected API response format: {slug_data}")
            
            slug = slug_data['choices'][0]['message']['content'].strip().lower()

            # Generate title
            title_response = requests.post(
                # Routed through OpenRouter; an earlier version called the OpenAI
                # chat completions endpoint directly with model 'gpt-4'.
                'https://openrouter.ai/api/v1/chat/completions',
                headers={
                    'Authorization': f'Bearer {self.openrouter_key}',
                    'HTTP-Referer': 'http://localhost:5001',
                    'X-Title': 'Blog Generator'
                },
                json={
                    'model': 'google/gemini-2.0-flash-thinking-exp:free',
                    'messages': [{
                        'role': 'user',
                        'content': f"""Extract the blog post title from the following blog post:

{blog_content}

The blog post title must include the primary keyword {primary_keyword} and must inform the users right away of what they can expect from reading the blog post.

- Don't wrap the output in quotation marks. The output should be plain text with no markdown or formatting.

Your output must only be the blog post title and nothing else."""
                    }]
                },
                timeout=60
            )
            
            if title_response.status_code != 200:
                raise Exception(f"OpenRouter API error: {title_response.text}")
            
            title_data = title_response.json()
            if 'choices' not in title_data:
                raise Exception(f"Unexpected API response format: {title_data}")
            
            title = title_data['choices'][0]['message']['content'].strip()

            # Generate meta description
            meta_desc_response = requests.post(
                # Routed through OpenRouter; an earlier version called the OpenAI
                # chat completions endpoint directly with model 'gpt-4'.
                'https://openrouter.ai/api/v1/chat/completions',
                headers={
                    'Authorization': f'Bearer {self.openrouter_key}',
                    'HTTP-Referer': 'http://localhost:5001',
                    'X-Title': 'Blog Generator'
                },
                json={
                    'model': 'google/gemini-2.0-flash-thinking-exp:free',
                    'messages': [{
                        'role': 'user',
                        'content': f"""Create a proper meta description for the following blog post:

{blog_content}

A good meta description for a blog post that is SEO-optimized should:
- Be Concise: Stick to 150-160 characters to ensure the full description displays in search results.
- Include Keywords: Incorporate primary keywords naturally to improve visibility and relevance to search queries.

Primary keyword = {primary_keyword}

More keywords to include if possible = [{cluster_data['Keywords']}]

- Provide Value: Clearly describe what the reader will learn or gain by clicking the link.
- Be Engaging: Use persuasive language, such as action verbs or a question, to encourage clicks.
- Align with Content: Ensure the description accurately reflects the blog post to meet user expectations and reduce bounce rates.

Your output must only be the meta description and nothing else."""
                    }]
                },
                timeout=60
            )
            
            if meta_desc_response.status_code != 200:
                raise Exception(f"OpenRouter API error: {meta_desc_response.text}")
            
            meta_desc_data = meta_desc_response.json()
            if 'choices' not in meta_desc_data:
                raise Exception(f"Unexpected API response format: {meta_desc_data}")
            
            meta_desc = meta_desc_data['choices'][0]['message']['content'].strip()

            # Validate the results
            if not title or not meta_desc or not slug:
                raise Exception("Empty title, meta description, or slug")

            return {
                'slug': slug,
                'title': title,
                'meta_description': meta_desc
            }
        except Exception as e:
            print(f"Error in generate_metadata: {e}")
            raise
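

# generate_metadata only strips and lower-cases the slug returned by the model,
# so stray quotes, spaces, or markdown could end up in the post URL.
# The helper below is a minimal sketch of one way to normalise that output.
# The name sanitize_slug and the max_words default are assumptions made for
# illustration; they are not part of the original module. It relies on the
# `re` import at the top of this file.
def sanitize_slug(raw: str, max_words: int = 5) -> str:
    """Reduce arbitrary model output to a lowercase, hyphen-separated slug."""
    # Split on every run of characters that is not an ASCII lowercase letter or
    # digit, then drop the empty strings left at the edges. Non-ASCII letters
    # are discarded by this pattern.
    words = [w for w in re.split(r'[^a-z0-9]+', raw.lower()) if w]
    # Keep at most max_words words, mirroring the "4 or 5 words max" prompt rule.
    return '-'.join(words[:max_words])


# Possible usage inside generate_metadata (hypothetical):
#     slug = sanitize_slug(slug_data['choices'][0]['message']['content'])
# An all-punctuation reply yields an empty string, which the existing
# "Empty title, meta description, or slug" check would then reject.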